Use Elasticsearch as of today, not as of the day before yesterday.
If you work with Elasticsearch, I expect you have read the official recommendations about heap size. If you haven’t, let me summarise them:
- Use no more than a 32 GB heap, and prefer a lot of small machines. If you have a big machine, just run more than one JVM, each with a heap below 32 GB.
- Keep half of the memory free for the VFS/slab cache; they call it ‘Lucene memory’.
- To save heap space, force-enable UseCompressedOops, although by August 2015 this was already a historical recommendation.
- If you read deeper, you will find the claim that G1 GC isn’t stable and that it is better to use CMS. Yet the OpenJDK tracker contains only one G1 bug related to Lucene; it was resolved in August 2016, and it occurred only on 32-bit Java, which supports a heap of up to 4 GB at best.
The result?
You waste a lot of memory on modern servers. The JVMs spend a lot of time context switching, because Elasticsearch runs a lot of threads (hundreds on a modern CPU!), and if you run 3 or 4 separate JVMs, you make things much worse. GC pauses last seconds, sometimes up to a minute or even a few minutes.
Sounds familiar, doesn’t it?
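If you want to check the thread claim on your own data nodes, here is a minimal sketch, assuming a Linux host where /proc is readable. It reads the `Threads:` field from /proc/<pid>/status for every process whose command name is `java`. The function names are mine, not part of any Elasticsearch tooling.

```python
import re
from pathlib import Path


def thread_count(status_text: str) -> int:
    """Return the value of the 'Threads:' field from /proc/<pid>/status text."""
    match = re.search(r"^Threads:\s+(\d+)$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no Threads: field found")
    return int(match.group(1))


def java_thread_counts(proc_root: str = "/proc") -> dict:
    """Map PID -> thread count for every process whose comm is 'java'."""
    root = Path(proc_root)
    counts = {}
    if not root.is_dir():
        return counts  # not a Linux host, nothing to report
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            if (entry / "comm").read_text().strip() != "java":
                continue
            counts[int(entry.name)] = thread_count((entry / "status").read_text())
        except (OSError, ValueError):
            continue  # process exited or is not readable; skip it
    return counts


if __name__ == "__main__":
    for pid, threads in sorted(java_thread_counts().items()):
        print(f"java pid {pid}: {threads} threads")
```

Run it on a node with several JVMs and add up the numbers; that sum is what the scheduler has to juggle.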
Notes on how to run Elasticsearch on a modern machine with 192 GB of RAM and a few terabytes of storage per data node:
- Switch /sys/kernel/mm/transparent_hugepage/enabled and /sys/kernel/mm/transparent_hugepage/defrag to madvise.
- Set the heap size to 135g or 128g or another reasonable value, where reasonable depends on how you’re using the cluster. Either way, slabtop should help you make the right decision. You can also use pcstat, fincore or any other tool that wraps mincore(2), but that is very expensive on big storage.
- Change the JVM GC settings to something like -XX:+UseG1GC -XX:-UseCompressedOops -XX:+AlwaysPreTouch -XX:+UseTransparentHugePages -XX:+UseNUMA -XX:-UseBiasedLocking
- Don’t run more than one JVM per server.
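The list above can be sketched as a small check script. It reports the active transparent hugepage mode (the kernel marks the active choice with square brackets, e.g. `always [madvise] never`) and prints the JVM flags one per line, ready to paste into your JVM options. The helper names are mine, the 128g default is just one of the values mentioned above, and setting -Xms equal to -Xmx is my assumption so that AlwaysPreTouch touches the whole heap at startup.

```python
from pathlib import Path

THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def active_thp_mode(sysfs_text: str) -> str:
    """Return the bracketed (active) choice, e.g. 'always [madvise] never' -> 'madvise'."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode in {sysfs_text!r}")


def jvm_options(heap: str = "128g") -> list:
    """GC flags from the list above plus a fixed heap size (Xms == Xmx is an assumption)."""
    return [
        f"-Xms{heap}",
        f"-Xmx{heap}",
        "-XX:+UseG1GC",
        "-XX:-UseCompressedOops",
        "-XX:+AlwaysPreTouch",
        "-XX:+UseTransparentHugePages",
        "-XX:+UseNUMA",
        "-XX:-UseBiasedLocking",
    ]


if __name__ == "__main__":
    for knob in ("enabled", "defrag"):
        path = THP_DIR / knob
        if path.exists():
            mode = active_thp_mode(path.read_text())
            note = "" if mode == "madvise" else "  <- expected madvise"
            print(f"{path}: {mode}{note}")
    print("\n".join(jvm_options()))
```

The script only reads the sysfs files; switching them to madvise is still a job for root (and for whatever you use to persist settings across reboots).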
Sounds easy, doesn’t it? Just don’t be afraid, and think.
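For the thinking part about heap size, a rough first look at how much memory the kernel currently spends on page cache and slab can come from /proc/meminfo. This is a minimal sketch, not a replacement for slabtop: it only parses the `MemTotal`, `Cached`, `Slab` and `SReclaimable` fields and converts them from kB to GiB. The function names are mine.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo text into {field: number as reported (kB for size fields)}."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cache_summary_gib(fields: dict) -> dict:
    """Return selected size fields converted from kB to GiB, rounded to one decimal."""
    kib_per_gib = 1024 * 1024
    wanted = ("MemTotal", "Cached", "Slab", "SReclaimable")
    return {key: round(fields.get(key, 0) / kib_per_gib, 1) for key in wanted}


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        for key, gib in cache_summary_gib(parse_meminfo(meminfo.read_text())).items():
            print(f"{key}: {gib} GiB")
```

Look at these numbers on a node under its normal load; a snapshot right after a restart says little about what the cache really needs.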
Below you can find the status of one cluster that was partially reconfigured according to my recommendations. es1 is an original machine that runs 3 JVMs, each with a heap a bit below 32 GB; es2 is an experimental machine with the settings above.
As you can see, CPU usage and load average are very different. Same cluster, same requests. GC pauses also decreased by about a dozen times, and as a side effect latency improved.