Spring Cloud Gateway: configuration corrupted under load testing, causing services to return 405

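Background

The gateway is Spring Cloud Gateway, and its configuration is persisted in Nacos.

Gateway source code analysis

Reading the gateway source shows that StripPrefixGatewayFilterFactory uses the parts value to strip leading segments from the request path. The correct value is 1, but at run time it was refreshed to 5, so the wrong prefix was stripped from the request.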
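
To make the effect of a wrong parts value concrete, here is a minimal sketch of prefix stripping. It is a simplified stand-in written for illustration, not the actual StripPrefixGatewayFilterFactory code; the class name StripPrefixSketch and the sample path are assumptions.

```java
public class StripPrefixSketch {

    /**
     * Drops the first {@code parts} non-empty segments of {@code path}.
     * Simplified: empty segments are ignored and an empty result becomes "/".
     */
    public static String stripPrefix(String path, int parts) {
        StringBuilder out = new StringBuilder();
        int skipped = 0;
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // leading slash or doubled slashes
            }
            if (skipped < parts) {
                skipped++; // still inside the prefix to strip
                continue;
            }
            out.append('/').append(segment);
        }
        return out.length() == 0 ? "/" : out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical request path with four segments.
        System.out.println(stripPrefix("/pay/api/order/1", 1)); // prints /api/order/1
        System.out.println(stripPrefix("/pay/api/order/1", 5)); // prints /
    }
}
```

With parts=5 every segment of this sample path is dropped, so the downstream service would receive a request for a path it was never meant to handle.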

@Override
public void start() {
    if (this.running.compareAndSet(false, true)) {
        // Trigger the scheduled task once every 30 seconds
        this.watchFuture = this.taskScheduler.scheduleWithFixedDelay(
                this::nacosServicesWatch, this.properties.getWatchDelay());
    }
}

public void nacosServicesWatch() {
    // Nacos does nothing here except force-publish an update event; normally it
    // should first check whether any service has actually changed.
    // This is therefore a possible place to hook in a fix for the bug.
    // nacos doesn't support watch now , publish an event every 30 seconds.
    this.publisher.publishEvent(
            new HeartbeatEvent(this, nacosWatchIndex.getAndIncrement()));
}
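
The pattern in the code above, an unconditional heartbeat that every listener reacts to, can be sketched with plain JDK classes. This is only an illustration under assumptions: HeartbeatSketch, addListener and tick are hypothetical names, and a LongConsumer stands in for Spring's event listeners.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

public class HeartbeatSketch {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final AtomicLong watchIndex = new AtomicLong(0);
    private final List<LongConsumer> listeners = new CopyOnWriteArrayList<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> watchFuture;

    public void addListener(LongConsumer listener) {
        listeners.add(listener);
    }

    /** Starts the timer once; repeated calls are ignored. */
    public void start(long delayMillis) {
        if (running.compareAndSet(false, true)) {
            watchFuture = scheduler.scheduleWithFixedDelay(
                    this::tick, delayMillis, delayMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Publishes a heartbeat without checking whether anything changed. */
    public void tick() {
        long index = watchIndex.getAndIncrement();
        for (LongConsumer listener : listeners) {
            listener.accept(index); // every listener refreshes on every tick
        }
    }

    public void stop() {
        if (running.compareAndSet(true, false)) {
            watchFuture.cancel(false);
        }
        scheduler.shutdown();
    }

    public static void main(String[] args) {
        HeartbeatSketch watch = new HeartbeatSketch();
        List<Long> seen = new ArrayList<>();
        watch.addListener(index -> seen.add(index));
        // Drive three heartbeats by hand; start(30_000) would do this on a timer.
        watch.tick();
        watch.tick();
        watch.tick();
        System.out.println(seen); // prints [0, 1, 2]
        watch.stop();
    }
}
```

Because tick never compares old and new state, every listener does its full refresh work on each heartbeat, whether or not anything changed.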

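Route configuration refresh mechanism

After startup, Spring Cloud Gateway caches the route configuration, and every 30 seconds it refreshes the cached configuration into the objects that actually execute the routes.
I added monitoring logging to the method that refreshes the route configuration:
RouteDefinitionRouteLocator.loadGatewayFilters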

ConfigurationUtils.bind(configuration, properties,
        factory.shortcutFieldPrefix(), definition.getName(), validator);

if (configuration instanceof StripPrefixGatewayFilterFactory.Config) {
    StripPrefixGatewayFilterFactory.Config stripConfig =
            (StripPrefixGatewayFilterFactory.Config) configuration;
    if (stripConfig.getParts() > 2) {
        String errorMessage = "parts 异常:" + stripConfig.getParts()
                + "definition:" + definition
                + "properties:" + properties
                + "id:" + id;
        log.error(errorMessage, new RuntimeException(errorMessage));
    }
}

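properties is a Map that holds the parts value to be applied. configuration is the config object being refreshed, and it has an int property named parts. The gateway uses ConfigurationUtils.bind to inject the parts value into configuration.
The error log produced after adding the code above ("parts 异常" means "abnormal parts value"):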

2022-07-15 13:46:38,233 - parts 异常:5definition:FilterDefinition{name='StripPrefix', args={parts=1}}properties:{parts=1}id:pay-api-web
java.lang.RuntimeException: parts 异常:5definition:FilterDefinition{name='StripPrefix', args={parts=1}}properties:{parts=1}id:pay-api-web
at org.springframework.cloud.gateway.route.RouteDefinitionRouteLocator.loadGatewayFilters(RouteDefinitionRouteLocator.java:183)
at org.springframework.cloud.gateway.route.RouteDefinitionRouteLocator.getFilters(RouteDefinitionRouteLocator.java:212)
at org.springframework.cloud.gateway.route.RouteDefinitionRouteLocator.convertToRoute(RouteDefinitionRouteLocator.java:143)
at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:100)
at reactor.core.publisher.FluxFlatMap$FlatMapMain.drainLoop(FluxFlatMap.java:664)

As the log shows, the value after binding is 5, while properties still holds {parts=1}. The bug therefore occurs in:

ConfigurationUtils.bind(configuration, properties,
        factory.shortcutFieldPrefix(), definition.getName(), validator);

On this line, ConfigurationUtils.bind ends up calling the bind method of Spring's low-level JavaBeanBinder.
Analyzing JavaBeanBinder.bind further:

private <T> boolean bind(BeanSupplier<T> beanSupplier, BeanPropertyBinder propertyBinder, BeanProperty property) {
    String propertyName = property.getName();
    ResolvableType type = property.getType();
    Supplier<Object> value = property.getValue(beanSupplier);
    Annotation[] annotations = property.getAnnotations();
    // This call returns bound=5 after binding, which is wrong!
    Object bound = propertyBinder.bindProperty(propertyName,
            Bindable.of(type).withSuppliedValue(value).withAnnotations(annotations));
    if (bound == null) {
        return false;
    }
    if (property.isSettable()) {
        property.setValue(beanSupplier, bound);
    }
    else if (value == null || !bound.equals(value.get())) {
        throw new IllegalStateException("No setter found for property: " + property.getName());
    }
    return true;
}

Object bound = propertyBinder.bindProperty(propertyName,
        Bindable.of(type).withSuppliedValue(value).withAnnotations(annotations));

JavaBeanBinder is the low-level Spring Boot class that binds bean properties.

Analysis of the bind process

Call chain after propertyBinder.bindProperty:
->Binder.bind
->BindConverter.convert
->TypeConverterSupport.convertIfNecessary
->TypeConverterDelegate.doConvertValue
->TypeConverterDelegate.doConvertTextValue

The doConvertTextValue method contains these two lines:

editor.setAsText(newTextValue);
return editor.getValue();

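The editor is created by PropertyEditorRegistrySupport.createDefaultEditors(), one object per type, and is looked up at run time with propertyEditorRegistry.getDefaultEditor(requiredType). If requiredType is the same, the same object is returned.
So when two calls obtain the same editor, a race is possible. The interleaving looks like this:
A: editor.setAsText(1)
B: editor.setAsText(5)
A: return editor.getValue();
A now reads 5 instead of the expected 1, and the wrong value gets bound.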
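
The interleaving above can be replayed deterministically on a single thread using only the JDK. The sketch below is an illustration under assumptions: IntEditor is a hypothetical stand-in for the shared default editor, not Spring's own editor class.

```java
import java.beans.PropertyEditor;
import java.beans.PropertyEditorSupport;

public class SharedEditorSketch {

    /** Hypothetical stand-in for a shared default editor: parses text into an Integer. */
    public static class IntEditor extends PropertyEditorSupport {
        @Override
        public void setAsText(String text) {
            setValue(Integer.valueOf(text.trim()));
        }
    }

    /**
     * Replays the A/B interleaving on one shared editor and returns
     * what caller A reads back.
     */
    public static Object replayInterleaving(PropertyEditor shared) {
        shared.setAsText("1");    // A: editor.setAsText(1)
        shared.setAsText("5");    // B: editor.setAsText(5)
        return shared.getValue(); // A: return editor.getValue()
    }

    public static void main(String[] args) {
        Object seenByA = replayInterleaving(new IntEditor());
        System.out.println(seenByA); // prints 5, although A wrote 1
    }
}
```

The editor keeps a single value field, so whichever caller wrote last decides what every later getValue() returns.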

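Root cause summary

  1. Spring Cloud Gateway and Nacos exchange a heartbeat every 30 seconds, and each heartbeat refreshes the in-memory route rules into the execution objects.
    The refresh calls ConfigurationUtils.bind, which relies on a PropertyEditor. That editor object is a singleton per type, so binding two values of the same type at the same time can race.
  2. During the load test the gateway is under heavy load, and RouteDefinitionRouteLocator.getRoutes() calls RouteDefinitionRouteLocator.loadGatewayFilters concurrently.
  3. ConfigurationUtils.bind, which RouteDefinitionRouteLocator.loadGatewayFilters depends on, is not safe under concurrency, so the route configuration gets corrupted.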
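
One way to avoid the race is to make the set-then-get pair atomic by serializing it with a lock. The sketch below shows that idea with plain JDK classes; SerializedBindSketch, IntEditor and convert are hypothetical names, and this is not a patch to Spring itself.

```java
import java.beans.PropertyEditor;
import java.beans.PropertyEditorSupport;
import java.util.concurrent.atomic.AtomicInteger;

public class SerializedBindSketch {

    /** Hypothetical stand-in for a shared default editor. */
    public static class IntEditor extends PropertyEditorSupport {
        @Override
        public void setAsText(String text) {
            setValue(Integer.valueOf(text.trim()));
        }
    }

    private final PropertyEditor shared = new IntEditor();
    private final Object lock = new Object();

    /** setAsText and getValue run as one unit, so no other caller can interleave. */
    public int convert(String text) {
        synchronized (lock) {
            shared.setAsText(text);
            return (Integer) shared.getValue();
        }
    }

    /** Two threads convert "1" and "5" through the same editor; returns wrong reads. */
    public static int countMismatches(int iterations) throws InterruptedException {
        SerializedBindSketch binder = new SerializedBindSketch();
        AtomicInteger mismatches = new AtomicInteger();
        Thread a = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                if (binder.convert("1") != 1) {
                    mismatches.incrementAndGet();
                }
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                if (binder.convert("5") != 5) {
                    mismatches.incrementAndGet();
                }
            }
        });
        a.start();
        b.start();
        a.join();
        b.join();
        return mismatches.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countMismatches(100_000)); // prints 0
    }
}
```

The lock trades some throughput for correctness. Since the refresh runs only on heartbeats rather than on every request, that cost is likely acceptable, but this is a judgment call rather than a measured result.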

Appendix

How to reproduce

import com.google.common.collect.Lists;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.BeanFactory;
import org.springframework.beans.factory.BeanFactoryAware;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.cloud.gateway.filter.FilterDefinition;
import org.springframework.cloud.gateway.filter.GatewayFilter;
import org.springframework.cloud.gateway.filter.OrderedGatewayFilter;
import org.springframework.cloud.gateway.filter.factory.GatewayFilterFactory;
import org.springframework.cloud.gateway.filter.factory.StripPrefixGatewayFilterFactory;
import org.springframework.cloud.gateway.filter.factory.StripPrefixGatewayFilterFactory.Config;
import org.springframework.cloud.gateway.support.ConfigurationUtils;
import org.springframework.cloud.gateway.support.HasRouteId;
import org.springframework.core.Ordered;
import org.springframework.expression.spel.standard.SpelExpressionParser;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.test.context.junit4.SpringRunner;
import org.springframework.validation.Validator;

/**
 * Reproduces the concurrent ConfigurationUtils.bind problem.
 *
 * @author littlehui
 * @version 1.0
 * @date 2022/7/18 22:45
 */
@RunWith(SpringRunner.class)
@SpringBootTest()
@ActiveProfiles("local")
public class RouteDefinitionRouteLocatorTest implements BeanFactoryAware {

    Logger logger = LoggerFactory.getLogger(RouteDefinitionRouteLocatorTest.class);

    @Autowired
    private Validator validator;

    private final SpelExpressionParser parser = new SpelExpressionParser();

    private BeanFactory beanFactory;

    @Autowired
    private List<GatewayFilterFactory> gatewayFilterFactoryList;

    private final Map<String, GatewayFilterFactory> gatewayFilterFactories = new HashMap<>();

    @Test
    public void concurrentTest() throws InterruptedException {

        /*
         * Assume the gateway has the route config below. Both filters bind an
         * int property (parts and retries).
         *
         * - id: r1
         *   uri: lb://service1
         *   predicates:
         *     - Path=/gateway/auth/**
         *   filters:
         *     - StripPrefix=1
         *
         * - id: r2
         *   uri: lb://service2
         *   predicates:
         *     - Path=/gateway/api/business/**
         *   filters:
         *     - name: Retry
         *       args:
         *         retries: 5
         *
         * The FilterDefinition objects below mock this config.
         */

        FilterDefinition f1 = new FilterDefinition();
        f1.setName("StripPrefix");
        f1.setArgs(new HashMap<>());
        f1.getArgs().put("_genkey_0", "1");

        FilterDefinition f2 = new FilterDefinition();
        f2.setName("Retry");
        f2.setArgs(new HashMap<>());
        f2.getArgs().put("retries", "5");

        // Each worker keeps rebuilding the filters of one route until it is interrupted.
        Runnable loadR1 = () -> {
            while (!Thread.currentThread().isInterrupted()) {
                loadGatewayFilters("r1", Lists.newArrayList(f1));
            }
        };
        Runnable loadR2 = () -> {
            while (!Thread.currentThread().isInterrupted()) {
                loadGatewayFilters("r2", Lists.newArrayList(f2));
            }
        };

        Thread t1 = new Thread(loadR1, "prefix1");
        Thread t2 = new Thread(loadR2, "prefix2");
        Thread t11 = new Thread(loadR1, "prefix1");
        Thread t22 = new Thread(loadR2, "prefix2");

        t1.start();
        t2.start();
        t11.start();
        t22.start();

        // A worker interrupts itself and exits once it sees a wrong value.
        // Poll isAlive() rather than isInterrupted(), which may report false
        // for a thread that has already terminated.
        while (t1.isAlive() && t11.isAlive() && t2.isAlive() && t22.isAlive()) {
            Thread.sleep(100);
        }

        // Stop the remaining workers so the test can finish.
        t1.interrupt();
        t11.interrupt();
        t2.interrupt();
        t22.interrupt();
    }

    List<GatewayFilter> loadGatewayFilters(String id,
            List<FilterDefinition> filterDefinitions) {
        ArrayList<GatewayFilter> ordered = new ArrayList<>(filterDefinitions.size());
        for (int i = 0; i < filterDefinitions.size(); i++) {
            FilterDefinition definition = filterDefinitions.get(i);
            GatewayFilterFactory factory = this.gatewayFilterFactories
                    .get(definition.getName());
            if (factory == null) {
                throw new IllegalArgumentException(
                        "Unable to find GatewayFilterFactory with name "
                                + definition.getName());
            }
            Map<String, String> args = definition.getArgs();
            if (logger.isDebugEnabled()) {
                logger.debug("RouteDefinition " + id + " applying filter " + args + " to "
                        + definition.getName());
            }

            Map<String, Object> properties = factory.shortcutType().normalize(args,
                    factory, this.parser, this.beanFactory);

            Object configuration = factory.newConfig();

            /* // Fix: serialize the bind call with a lock
            synchronized (this) {
                ConfigurationUtils.bind(configuration, properties,
                        factory.shortcutFieldPrefix(), definition.getName(), validator);
            }*/

            ConfigurationUtils.bind(configuration, properties,
                    factory.shortcutFieldPrefix(), definition.getName(), validator);
            // some filters require routeId
            // TODO: is there a better place to apply this?
            if (configuration instanceof HasRouteId) {
                HasRouteId hasRouteId = (HasRouteId) configuration;
                hasRouteId.setRouteId(id);
            }

            GatewayFilter gatewayFilter = factory.apply(configuration);

            // assert statement start
            if (configuration instanceof StripPrefixGatewayFilterFactory.Config) {
                int parts = ((Config) configuration).getParts();
                if ("r1".equals(id) && parts != 1) {
                    logger.error("for router id r1,expect parts is 1,but actual is {}", parts);
                    Thread.currentThread().interrupt();
                } else if ("r2".equals(id) && parts != 5) {
                    // Not reached while r2 only carries a Retry filter.
                    logger.error("for router id r2,expect parts is 5,but actual is {}", parts);
                    Thread.currentThread().interrupt();
                }
            }
            // assert statement end

            if (gatewayFilter instanceof Ordered) {
                ordered.add(gatewayFilter);
            } else {
                ordered.add(new OrderedGatewayFilter(gatewayFilter, i + 1));
            }
        }

        return ordered;
    }

    @Override
    public void setBeanFactory(BeanFactory beanFactory) throws BeansException {
        this.beanFactory = beanFactory;
    }

    @Before
    public void init() {
        gatewayFilterFactoryList.forEach(
                factory -> this.gatewayFilterFactories.put(factory.name(), factory));
    }

}