瓜农老梁

Someone who wants to share some practical know-how. WeChat official account: 「瓜农老梁」


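Preface

To make messages smaller and faster to transmit, HTTP/2 compresses headers using the HPACK algorithm. This post first looks at Wireshark captures to get a direct feel for how much header compression saves, then analyzes how the algorithm works.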
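As a small illustration of one HPACK building block, here is a minimal sketch of the prefix-integer encoding from RFC 7541 (section 5.1), plus the "indexed header field" representation that refers to a static-table entry by number. The function names `encode_hpack_int` and `indexed_header` are made up for this example and are not taken from the post.

```python
def encode_hpack_int(value: int, prefix_bits: int) -> bytes:
    """Encode a non-negative integer with an N-bit prefix (RFC 7541, section 5.1)."""
    if value < 0 or not 1 <= prefix_bits <= 8:
        raise ValueError("value must be >= 0 and prefix_bits in 1..8")
    max_prefix = (1 << prefix_bits) - 1
    if value < max_prefix:
        # Small values fit entirely in the prefix.
        return bytes([value])
    # Otherwise fill the prefix with ones and emit the rest in 7-bit groups,
    # least significant group first, with the high bit as a continuation flag.
    out = [max_prefix]
    value -= max_prefix
    while value >= 128:
        out.append((value % 128) + 128)
        value //= 128
    out.append(value)
    return bytes(out)


def indexed_header(index: int) -> bytes:
    """Indexed header field: a '1' bit followed by the index in a 7-bit prefix."""
    encoded = bytearray(encode_hpack_int(index, 7))
    encoded[0] |= 0x80
    return bytes(encoded)


# Static table entry 2 is ":method: GET", so the whole header fits in one byte.
print(indexed_header(2).hex())
print(encode_hpack_int(1337, 5).hex())
```

Replacing a full `:method: GET` header with a single byte is the kind of saving that shows up in a packet capture.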


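Preface

Before transferring data, HTTP/2 first establishes a connection; the connection is marked as established when the client sends the connection preface (Magic). HTTP/2 is an application-layer protocol that sits on top of TCP/IP and the TLS security layer. Establishing an HTTP/2 connection therefore goes through the TCP handshake, the TLS handshake, and the HTTP/2 connection preface, in that order. The full article illustrates this with a network layering diagram.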
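For reference, the client connection preface is a fixed 24-octet sequence defined in RFC 7540 (section 3.5). The sketch below only checks whether a byte buffer begins with it; `starts_with_preface` is a hypothetical helper name, and real servers also expect a SETTINGS frame right after the preface.

```python
# Client connection preface ("Magic") from RFC 7540, section 3.5.
HTTP2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"


def starts_with_preface(data: bytes) -> bool:
    """Return True if the first bytes received on a connection are the HTTP/2 preface."""
    return data.startswith(HTTP2_PREFACE)


print(len(HTTP2_PREFACE))
print(HTTP2_PREFACE.hex())
print(starts_with_preface(HTTP2_PREFACE + b"\x00\x00\x00\x04\x00\x00\x00\x00\x00"))
```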


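This post reviews some highlights of the Kafka releases up to version 2.4.1: pitfalls I have actually run into, features I may use in practice later, or things I personally follow. Follow the official release notes links for the full picture.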

0.11.0.3

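0.11.0.2 was released on November 17, 2017; the bug-fix release 0.11.0.3 followed on June 2, 2018.

It fixes a bug present up to 0.11.0.2 that once caused a production incident: heap memory could not be reclaimed properly, leading to frequent Full GCs. See: "Kafka (version 0.11.0.2): analysis of heap memory not being reclaimed properly [field notes]" [KAFKA-6307]

0.11.0.3 official release notes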


The retries parameter

Choosing a parameter value is usually a trade-off. Here is how the retries parameter is described in version 0.11.3:

Setting a value greater than zero will cause the client to resend
any record whose send fails with a potentially transient error.
Note that this retry is no different than if the client resent the
record upon receiving the error.
Allowing retries without setting max.in.flight.requests.per.connection to 1 will potentially change
the ordering of records because if two batches are sent to a single
partition, and the first fails and is retried but the second succeeds,
then the records in the second batch may appear first.

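Note: when a send fails, the client retries up to the number of times given by retries. In this version the parameter defaults to 0, i.e. fail-fast mode: when a send fails, the client application decides whether to send again. Setting retries greater than 0 without also setting max.in.flight.requests.per.connection=1 means giving up message ordering.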
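The reordering can be seen in a toy model. This is only a sketch of the scenario in the quoted documentation, not the producer's actual implementation: `simulate` is a hypothetical function that sends batches in windows of `max_in_flight` and resends any batch whose first attempt fails.

```python
from collections import deque


def simulate(batches, fail_first_attempt, max_in_flight):
    """Return the order in which batches end up in the partition log.

    batches: batch names in the order the application sent them.
    fail_first_attempt: names whose first attempt hits a transient error.
    max_in_flight: how many batches are sent before any response is seen.
    """
    pending = deque(batches)
    failed_once = set()
    log = []
    while pending:
        # Send up to max_in_flight batches without waiting for responses.
        window = [pending.popleft() for _ in range(min(max_in_flight, len(pending)))]
        retry = []
        for batch in window:
            if batch in fail_first_attempt and batch not in failed_once:
                failed_once.add(batch)
                retry.append(batch)  # resent later because retries > 0
            else:
                log.append(batch)    # accepted by the broker
        # Retried batches go back to the front of the queue.
        pending.extendleft(reversed(retry))
    return log


print(simulate(["A", "B"], {"A"}, max_in_flight=2))
print(simulate(["A", "B"], {"A"}, max_in_flight=1))
```

With two batches in flight, B is written before the retried A; with one in flight, A is retried before B is ever sent, so order is preserved.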


Contents

System metrics
GC metrics
JVM metrics
Topic metrics
Broker metrics

System metrics

Collecting system information

java.lang:type=OperatingSystem

{"freePhysicalMemorySize":"806023168","maxFileDescriptorCount":"4096","openFileDescriptorCount":"283","processCpuLoad":"0.0017562901839817224","systemCpuLoad":"0.014336627412954635","systemLoadAverage":"0.37"}

Collecting thread information

java.lang:type=Threading

{"peakThreadCount":"88","threadCount":"74"}
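Note that every value in these samples is a JSON string, so a consumer has to convert them before doing arithmetic. The sketch below parses the two samples above and derives a file-descriptor usage ratio; `to_numbers` and `fd_usage` are hypothetical helper names, and how the JSON is actually fetched from JMX is outside its scope.

```python
import json

# The two sample payloads shown above.
OS_SAMPLE = (
    '{"freePhysicalMemorySize":"806023168","maxFileDescriptorCount":"4096",'
    '"openFileDescriptorCount":"283","processCpuLoad":"0.0017562901839817224",'
    '"systemCpuLoad":"0.014336627412954635","systemLoadAverage":"0.37"}'
)
THREAD_SAMPLE = '{"peakThreadCount":"88","threadCount":"74"}'


def to_numbers(raw: str) -> dict:
    """Parse a metrics payload and convert its string values to floats."""
    return {name: float(value) for name, value in json.loads(raw).items()}


def fd_usage(os_metrics: dict) -> float:
    """Fraction of the file-descriptor limit currently in use."""
    return os_metrics["openFileDescriptorCount"] / os_metrics["maxFileDescriptorCount"]


os_metrics = to_numbers(OS_SAMPLE)
thread_metrics = to_numbers(THREAD_SAMPLE)
print(f"fd usage: {fd_usage(os_metrics):.1%}")
print(f"threads: {thread_metrics['threadCount']:.0f} (peak {thread_metrics['peakThreadCount']:.0f})")
```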

Getting mapped and direct buffer space

Use BufferPoolMXBean to obtain used, capacity, and count.


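Problem description

When new nodes are added to a cluster, the replicas of existing topics need to be migrated to balance traffic. The procedure is illustrated with our company cluster being expanded by two nodes, broker 4 and broker 5.

Question: how can this be done smoothly, i.e. with client applications noticing as little as possible?

To make the migration smooth, it is done in three steps:

1. Balance the replicas

Distribute each topic's replicas evenly across the brokers.

2. Set the preferred replicas

Distribute the preferred replicas evenly across the brokers, in preparation for leader balancing.

3. Balance the leaders

Run the leader rebalance.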
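The first two steps can be sketched as generating a reassignment plan in the JSON format accepted by `kafka-reassign-partitions.sh`, where the first broker in each `replicas` list is the preferred replica. This is a minimal sketch under assumptions: the topic name, partition count, replication factor, and the existing broker ids 1 to 3 are hypothetical (only brokers 4 and 5 come from the text), and `round_robin_assignment` is not the post's actual procedure.

```python
import json


def round_robin_assignment(topic, num_partitions, replication_factor, brokers):
    """Spread replicas and preferred replicas (first in each list) evenly."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    partitions = []
    for p in range(num_partitions):
        # Rotating the start position rotates the preferred replica as well.
        replicas = [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical cluster: brokers 1-3 already exist, 4 and 5 are newly added.
plan = round_robin_assignment("demo-topic", 10, 3, [1, 2, 3, 4, 5])
print(json.dumps(plan, indent=2))
```

With 10 partitions on 5 brokers, each broker is the preferred replica for two partitions, so a subsequent leader rebalance yields an even leader distribution.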


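Upgrade notes

The message format is unchanged, so only the broker version needs to change.

Download the 2.2.1 release, keep each node's existing configuration unchanged, and add the setting below pinning the protocol version to 0.11. Restart the machines one at a time, moving on to the next only after inbound and outbound traffic is back to normal, and keep a close watch on important clients.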

inter.broker.protocol.version=0.11.0

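Once everything is normal, set the broker protocol version to 2.2 and again restart the machines one at a time, moving on to the next only after inbound and outbound traffic is back to normal, and keep a close watch on important clients.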

inter.broker.protocol.version=2.2


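Problem description

An SMS alert reported that heap usage was still above 4 GB after GC, the same situation described in the previous article, except that last time the alert SMS was never sent. This time, before intervening, I took a heap dump of the node to help locate the cause.

The GC log at the time of the alert shows heap usage still around 4 GB after collection, only a few hundred MB lower than before.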

2019-05-26T20:41:52.086+0800: 12768164.084: [GC pause (G1 Evacuation Pause) (young), 0.0296753 secs]
[Parallel Time: 27.0 ms, GC Workers: 28]
[GC Worker Start (ms): Min: 12768164084.0, Avg: 12768164084.2, Max: 12768164084.5, Diff: 0.5]
[Ext Root Scanning (ms): Min: 18.1, Avg: 19.0, Max: 19.9, Diff: 1.8, Sum: 532.0]
[Update RS (ms): Min: 1.2, Avg: 1.7, Max: 2.3, Diff: 1.1, Sum: 47.6]
[Processed Buffers: Min: 1, Avg: 2.4, Max: 8, Diff: 7, Sum: 68]
[Scan RS (ms): Min: 1.2, Avg: 1.8, Max: 2.1, Diff: 0.9, Sum: 49.6]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 3.1, Avg: 3.9, Max: 4.6, Diff: 1.4, Sum: 110.6]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.4]
[Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 28]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.6]
[GC Worker Total (ms): Min: 26.2, Avg: 26.5, Max: 26.7, Diff: 0.5, Sum: 742.2]
[GC Worker End (ms): Min: 12768164110.7, Avg: 12768164110.7, Max: 12768164110.8, Diff: 0.1]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.7 ms]
[Other: 1.9 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.6 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.3 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.4 ms]
[Eden: 400.0M(400.0M)->0.0B(400.0M) Survivors: 8192.0K->8192.0K Heap: 4426.3M(8192.0M)->4024.2M(8192.0M)]
[Times: user=0.75 sys=0.00, real=0.03 secs]
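To check the "only a few hundred MB" observation mechanically, the heap occupancy before and after the pause can be pulled out of the `Heap:` entry. This is a rough sketch that assumes the G1 log format shown above; `heap_before_after_mb` is a hypothetical helper, not part of any GC tooling.

```python
import re

# Matches e.g. "Heap: 4426.3M(8192.0M)->4024.2M(8192.0M)".
HEAP_RE = re.compile(
    r"Heap: (\d+(?:\.\d+)?)([BKMG])\([^)]*\)->(\d+(?:\.\d+)?)([BKMG])\("
)
# Conversion factors from the log's unit suffixes to megabytes.
TO_MB = {"B": 1 / (1024 * 1024), "K": 1 / 1024, "M": 1.0, "G": 1024.0}


def heap_before_after_mb(line: str):
    """Return (before_mb, after_mb) from a G1 summary line, or None if absent."""
    match = HEAP_RE.search(line)
    if match is None:
        return None
    before = float(match.group(1)) * TO_MB[match.group(2)]
    after = float(match.group(3)) * TO_MB[match.group(4)]
    return before, after


summary = (
    "[Eden: 400.0M(400.0M)->0.0B(400.0M) Survivors: 8192.0K->8192.0K "
    "Heap: 4426.3M(8192.0M)->4024.2M(8192.0M)]"
)
before, after = heap_before_after_mb(summary)
print(f"reclaimed {before - after:.1f} MB, {after:.1f} MB still in use")
```

For the pause above this gives roughly 402 MB reclaimed, which is essentially the 400 MB Eden space; the rest of the heap was not freed by this young collection.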

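Situation analysis

Client exception alerts

At 10:20 PM I got a phone call from a user team: their logs had been reporting the following exceptions continuously for more than 10 minutes.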

ERROR 2019-05-15 23:05:23,221 [kafka-producer-network-thread | producer-1] A failure occurred sending a message to Kafka.
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
WARN 2019-05-15 23:05:23,237 [main] Error sending event to listener kafkahandler, status: ABEND, event: Commit transaction