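Deploying SkyWalking 9.5 on Rancher

Result

A single SkyWalking deployment on Kubernetes is shared by multiple namespaces, and applications connect to it through a sidecar agent, which gives full end-to-end tracing with little effort.

Pull the image into the internal network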

docker pull apache/skywalking-oap-server:9.5.0
# Tag and push (tagging by image name instead of a machine-specific image ID)
docker tag apache/skywalking-oap-server:9.5.0 registry.cn-hangzhou.aliyuncs.com/earic/skywalking-oap-server:9.5.0
docker push registry.cn-hangzhou.aliyuncs.com/earic/skywalking-oap-server:9.5.0
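If more than one image has to be mirrored, the pull/tag/push steps above can be wrapped in a small helper. This is a minimal sketch, not part of the original procedure: the function names `mirror_target` and `mirror_image` are hypothetical, and the registry prefix is simply the one used in the commands above.

```shell
# Internal registry prefix, taken from the commands above.
REGISTRY="registry.cn-hangzhou.aliyuncs.com/earic"

# Map an upstream image reference to its name in the internal registry,
# keeping only the last path component (repository:tag).
mirror_target() {
  printf '%s/%s\n' "$REGISTRY" "${1##*/}"
}

# Pull, retag and push one image (needs docker and a registry login).
mirror_image() {
  docker pull "$1" &&
  docker tag "$1" "$(mirror_target "$1")" &&
  docker push "$(mirror_target "$1")"
}
```

Usage would look like `mirror_image apache/skywalking-oap-server:9.5.0`.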

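Deploying under a custom path

Reference

Since version 9.0, SkyWalking ships a rebuilt front-end UI called "Booster UI"; the earlier "Rocketbot UI" is deprecated.

Download skywalking-booster-ui

You can download the released source directly, for example 9.5.0: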

9.5.0

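Code changes

  • vite.config.ts

    In vite.config.ts, add:  base: "./", // similar to publicPath; "./" avoids a blank page when the built files are served, and without it the deployed site will not load either
    

  • src/router/index.ts

    In src/router/index.ts, change both occurrences of createWebHistory to createWebHashHistory.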
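The router change can also be applied from the command line. The following is a sketch under stated assumptions: `patch_router` is a hypothetical helper name, GNU sed is assumed for the in-place `-i` flag, and the file is assumed to contain no other identifier that includes `createWebHistory`.

```shell
# Rewrite every createWebHistory to createWebHashHistory in the given
# router file. The replacement text no longer matches the pattern, so
# running the helper twice leaves the file unchanged.
patch_router() {
  sed -i 's/createWebHistory/createWebHashHistory/g' "$1"
}
```

Usage would look like `patch_router src/router/index.ts`, run from the root of the skywalking-booster-ui source tree.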

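Build the static files

Installing Cypress

If the install hangs at "Unzipping Cypress 0% 0s", check the Node.js version, since the install depends on it. The version downloaded here was v18.17.0.

https://nodejs.org/download/release/latest-v18.x/

Run the commands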
npm i
npm run build-only
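Because the build depends on the Node.js version, it can help to check the version before running npm. This is only a sketch: `node_major` and `build_ui` are hypothetical helpers, and the assumption that the 18.x line is the one to require comes solely from v18.17.0 being the version used here.

```shell
# Extract the major version from a string such as "v18.17.0".
node_major() {
  v=${1#v}
  printf '%s\n' "${v%%.*}"
}

# Run the build only when the given Node.js version is 18.x;
# otherwise print a message to stderr and return a non-zero status.
build_ui() {
  if [ "$(node_major "$1")" = "18" ]; then
    npm i && npm run build-only
  else
    echo "expected Node.js 18.x, got $1" >&2
    return 1
  fi
}
```

Usage would look like `build_ui "$(node --version)"`.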


Build the image

mkdir -p /mnt/d/publish/skw && cd  /mnt/d/publish/skw
# Copy the dist output of the build above into the current directory
cp -r /mnt/d/tmp/skywalking-booster-ui-9.5.0/dist/ .

cat <<EOF > passwd
cyk:\$apr1\$anOYsKSJ\$P2RT/hf0OHzuEyWciCsdZ1
EOF

cat <<EOF > web.conf
server {
	listen 80;
	server_name  _;
	error_log /usr/local/openresty/nginx/logs/skw_error.log crit;
	access_log /usr/local/openresty/nginx/logs/skw_access.log;

    # The next two lines enable HTTP basic authentication
    auth_basic "Please input password"; # Prompt shown when credentials are requested
    auth_basic_user_file /usr/local/src/nginx/passwd;

	index index.html index.htm;
	location / {
		alias /usr/local/openresty/nginx/html/skw/;
		proxy_set_header X-Real-IP \$remote_addr;
		proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
		index index.html index.htm;
		try_files \$uri \$uri/ /index.html;
	}

    # Disable directory listing but still allow access to files
      location /css/{
                    root /usr/local/openresty/nginx/html/;
                    autoindex off;
                    proxy_store on;
            }
       location /img/{
                      root /usr/local/openresty/nginx/html/;
                      autoindex off;
                      proxy_store on;
              }
       location /js/{
                      root /usr/local/openresty/nginx/html/;
                      autoindex off;
                      proxy_store on;
              }


	error_page 500 502 503 504 /50x.html;
}
EOF
dos2unix web.conf

cat <<EOF > Dockerfile
FROM registry.cn-hangzhou.aliyuncs.com/earic/openresty:1.21.4.1-alpine
LABEL maintainer="wwj"


COPY dist/  /usr/local/openresty/nginx/html/skw
COPY passwd  /usr/local/src/nginx/passwd

#COPY web.conf /usr/local/openresty/nginx/conf/nginx.conf
COPY web.conf /etc/nginx/conf.d/default.conf
EXPOSE 80
EOF


## Works around the error: mediaType in manifest should be ‘application/vnd.docker.distribution.man
export BUILDAH_FORMAT=docker

# Use one timestamp tag for both build and push (for example 2023-08-02_10-18-55)
TAG=$(date +"%Y-%m-%d_%H-%M-%S")
podman build  --no-cache -t registry.cn-hangzhou.aliyuncs.com/earic/skw:${TAG} .
podman push registry.cn-hangzhou.aliyuncs.com/earic/skw:${TAG}
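The passwd file written earlier in this script holds a single htpasswd-style entry using the apr1 scheme. A new entry for a different user or password could be generated as sketched below. `htpasswd_line` is a hypothetical helper name, and it relies on the `openssl passwd -apr1` subcommand being available on the build machine.

```shell
# Print an htpasswd-style "user:hash" line using the Apache apr1 (MD5)
# scheme, the same format as the entry in the passwd file above.
# openssl picks a random salt, so the hash differs on every run.
htpasswd_line() {
  printf '%s:%s\n' "$1" "$(openssl passwd -apr1 "$2")"
}
```

Usage would look like `htpasswd_line cyk 'your-password' > passwd`. Writing the file this way also avoids having to escape each `$` inside a heredoc.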

Deploying on Rancher

ConfigMaps

  • skw-alarm
apiVersion: v1
kind: ConfigMap
metadata:
  name: skw-alarm
  annotations:
    {}
#    key: string
  labels:
    {}
#    key: string
  namespace: cyk-uat
data:
  alarm-settings.yml: |-
    rules:
      # Rule unique name, must be ended with `_rule`.
      service_resp_time_rule:
        metrics-name: service_resp_time
        op: ">"
        threshold: 1000
        period: 10
        count: 3
        silence-period: 5
        message: Response time of service {name} is more than 1000ms in 3 minutes of last 10 minutes.
      service_sla_rule:
        # Metrics value need to be long, double or int
        metrics-name: service_sla
        op: "<"
        threshold: 8000
        # The length of time to evaluate the metrics
        period: 10
        # How many times after the metrics match the condition, will trigger alarm
        count: 2
        # How many times of checks, the alarm keeps silence after alarm triggered, default as same as period.
        silence-period: 3
        message: Successful rate of service {name} is lower than 80% in 2 minutes of last 10 minutes
      service_resp_time_percentile_rule:
        # Metrics value need to be long, double or int
        metrics-name: service_percentile
        op: ">"
        threshold: 1000,1000,1000,1000,1000
        period: 10
        count: 3
        silence-period: 5
        message: Percentile response time of service {name} alarm in 3 minutes of last 10 minutes, due to more than one condition of p50 > 1000, p75 > 1000, p90 > 1000, p95 > 1000, p99 > 1000
      service_instance_resp_time_rule:
        metrics-name: service_instance_resp_time
        op: ">"
        threshold: 1000
        period: 10
        count: 2
        silence-period: 5
        message: Response time of service instance {name} is more than 1000ms in 2 minutes of last 10 minutes
      database_access_resp_time_rule:
        metrics-name: database_access_resp_time
        threshold: 1000
        op: ">"
        period: 10
        count: 2
        message: Response time of database access {name} is more than 1000ms in 2 minutes of last 10 minutes
      endpoint_relation_resp_time_rule:
        metrics-name: endpoint_relation_resp_time
        threshold: 1000
        op: ">"
        period: 10
        count: 2
        message: Response time of endpoint relation {name} is more than 1000ms in 2 minutes of last 10 minutes
    dingtalkHooks:
      textTemplate: |-
        {
          "msgtype": "text",
          "text": {
            "content": "Apache SkyWalking Alarm: \n %s."
          }
        }
      webhooks:
        - url: https://oapi.dingtalk.com/robot/send?access_token=5258e3472e467162a944d92d306423c5a78ff10b0c03b0b092465fdde47ad119

  • skw-config

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: skw-config
      annotations:
        {}
      labels:
        {}
      namespace: cyk-uat
    data:
      SW_STORAGE: elasticsearch
      SW_NAMESPACE: skywalking-index
      SW_STORAGE_ES_CLUSTER_NODES: 10.128.159.50:9200
      SW_ES_USER: elastic
      SW_STORAGE_DAY_STEP: '1'
      SW_STORAGE_ES_INDEX_REPLICAS_NUMBER: '0'
      SW_ES_PASSWORD: 5NXWhVgg3ulraED1TnXu
      SW_HEALTH_CHECKER: default
      SW_TELEMETRY: none
      SW_TELEMETRY_PROMETHEUS_HOST: 0.0.0.0
      SW_TELEMETRY_PROMETHEUS_PORT: '1234'
      SW_PROMETHEUS_FETCHER_ACTIVE: 'true'
      TZ: Asia/Shanghai
      JAVA_OPTS: '-Xms1g -Xmx1g -Duser.timezone=Asia/Shanghai'
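In the skw-alarm ConfigMap, service_sla_rule uses a threshold of 8000 while its message speaks of 80%. That pairing suggests the metric is expressed as percent times 100, so 10000 stands for 100%. The helper below is a small sketch under that assumption; `sla_threshold` is a hypothetical name.

```shell
# Convert a success-rate percentage into the service_sla threshold scale,
# assuming 10000 stands for 100% (so 80 becomes 8000).
sla_threshold() {
  echo $(( $1 * 100 ))
}
```

For example, `sla_threshold 95` gives the value to put in `threshold:` for a 95% success-rate alarm.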
    

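After upgrading, a storage value of elasticsearch7 must be changed to elasticsearch (see SW_STORAGE in skw-config), otherwise the OAP fails to start.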
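A saved copy of the ConfigMap can be checked for the old storage value before it is applied. This is a sketch only: `check_storage` is a hypothetical helper, and it assumes the value is written unquoted on the same line as the key, as in skw-config.

```shell
# Fail if a manifest still selects the old "elasticsearch7" storage value.
# Returns 0 when the file is fine, 1 when the old value is found.
check_storage() {
  if grep -Eq 'SW_STORAGE:[[:space:]]*elasticsearch7' "$1"; then
    echo "SW_STORAGE must be elasticsearch, not elasticsearch7" >&2
    return 1
  fi
}
```

Usage would look like `check_storage skw-config.yaml && kubectl apply -f skw-config.yaml`, where skw-config.yaml is an assumed local file name.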

Converting a CRT certificate to JKS

Skip this section if your Elasticsearch does not use HTTPS.

# Convert the CRT certificate and its key to a PKCS12 keystore
openssl pkcs12 -export -in ca.crt -inkey ca.key -out keystore.p12 -name "alias"
# PKCS12 to JKS
keytool -importkeystore -srckeystore keystore.p12 -srcstoretype PKCS12 -destkeystore keystore.jks -deststoretype JKS
# JKS to PKCS12
keytool -importkeystore -srckeystore keystore.jks -srcstoretype JKS -deststoretype PKCS12 -destkeystore keystore.p12

If keytool is not found

sudo yum install java-1.8.0-openjdk-devel

Importing a PEM certificate into a keystore

keytool -import -v -trustcacerts -file ca.pem  -keystore es_keystore.jks -keypass changeit -storepass changeit

For HTTPS, point the OAP at the keystore with

SW_STORAGE_ES_SSL_JKS_PATH=/nfs/keystore.jks

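In practice, an HTTPS Elasticsearch was not used here; a separate HTTP-only Elasticsearch instance was set up instead.

Deploy the OAP

Pod storage settings

Image settings

Deploy skw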
