Commit d88603d

committed
more wiki imports
1 parent 64e8513 commit d88603d

File tree

10 files changed

+2246
-1
lines changed

content/advisories/_index.md

Lines changed: 7 additions & 1 deletion
@@ -1,5 +1,11 @@
11
+++
22
title = 'Advisories'
33
date = 2024-09-02T11:18:58-07:00
4-
weight = 4
4+
weight = 5
55
+++
6+
7+
This section contains general advisories or notes relating to specific
8+
applications that use memcached, or security related notices.
9+
10+
- [Tuning Grafana Loki with Extstore](/advisories/grafanaloki/)
11+
- [DDoS attacks on old open servers](/advisories/ddos/)

content/advisories/ddos.md

Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,65 @@
1+
+++
2+
title = 'UDP DDoS'
3+
date = 2024-09-04T14:58:40-07:00
4+
+++
5+
6+
## Introduction
7+
8+
An amplification attack that abuses memcached works like other DDoS amplification attacks, such as [NTP or DNS amplification](https://en.wikipedia.org/wiki/Denial-of-service_attack#Amplification).
9+
The attack works by sending spoofed requests to a vulnerable server, which then responds with a larger amount of data than the initial request, magnifying the volume of traffic.
10+
11+
This method of amplification attack is possible because memcached servers can optionally operate over UDP.
12+
UDP is a connectionless network protocol: data can be sent without first completing a handshake, which is the network process where both sides agree to the communication.
13+
Attackers use UDP because the targeted host is never asked whether it is willing to receive the data, so a massive amount of data can be sent to the target without its consent.
14+
15+
## How does an attack with memcached work?
16+
17+
1. An attacker SETs a large item into an exposed memcached server.
18+
1. Next, the attacker sends a GET request whose source IP address is spoofed to be that of the targeted victim.
19+
1. The vulnerable memcached server that receives the request sends a large response to the target.
20+
1. The targeted server or its surrounding infrastructure is unable to process the large amount of data sent from the memcached server, resulting in overload and denial-of-service to legitimate requests.
21+
22+
## Mitigation
23+
24+
### Disable UDP
25+
26+
For memcached servers, make sure to disable UDP support if you do not need it.
27+
UDP is disabled by default on versions 1.5.6 and later.
28+
29+
It can be disabled by adding `-U 0` to your start arguments. If you use a
30+
config file such as `/etc/memcached.conf`, add the following line:
33+
34+
```conf
# Disable UDP protocol
-U 0
```
38+
39+
Save `/etc/memcached.conf` and restart the service (on Ubuntu, use `service memcached restart`).
40+
You can check it worked by using `netstat`:
41+
42+
```text
$ sudo netstat -plunt

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:11211         0.0.0.0:*               LISTEN      16079/memcached
```
49+
50+
As you can see, `memcached` is listening using only TCP.
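
If you want to automate that check, here is a minimal Python sketch that scans `netstat -plunt` style output for UDP sockets owned by memcached. The function name `udp_memcached_listeners` is hypothetical, and the parsing assumes the column layout shown above (protocol first, local address fourth, `PID/Program name` last); other `netstat` versions may format rows differently.

```python
def udp_memcached_listeners(netstat_output, program="memcached"):
    """Return local addresses of UDP sockets owned by `program`.

    Assumes `netstat -plunt` style rows: the protocol is the first
    column, the local address the fourth, and "PID/Program name" the last.
    """
    listeners = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Socket rows start with the protocol name (tcp, tcp6, udp, udp6).
        if not fields or not fields[0].startswith("udp"):
            continue
        # The last column looks like "16079/memcached".
        if fields[-1].endswith("/" + program):
            listeners.append(fields[3])
    return listeners


sample = """\
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:11211 0.0.0.0:* LISTEN 16079/memcached
"""
print(udp_memcached_listeners(sample))
```

An empty list means no UDP listener was found for the named program in the output you passed in.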
51+
52+
### Firewall memcached servers
53+
54+
If your memcached server is only used by the local machine, you can limit the listening address to 127.0.0.1.
55+
If other machines need to connect from a private network, force memcached to listen on a private IP (for example 10.0.0.1; adapt this to your own network).
56+
57+
You can change those settings in the file `/etc/memcached.conf`:
58+
59+
```conf
# Only listen on localhost
--listen 127.0.0.1
```
63+
64+
By default, Ubuntu and Debian bind Memcached to the local interface `127.0.0.1`.
65+
You can also use firewalls to block external traffic from reaching your memcached server.
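
As a quick sanity check on a configured listen address, the sketch below classifies it with Python's standard `ipaddress` module. The function name `listen_address_exposure` and its return labels are hypothetical, chosen for illustration; a "private" result still depends on your network actually being isolated from the internet.

```python
import ipaddress


def listen_address_exposure(address):
    """Classify a listen address as loopback, private, public, or all-interfaces."""
    ip = ipaddress.ip_address(address)
    # Check "unspecified" first: 0.0.0.0 binds every interface.
    if ip.is_unspecified:
        return "all-interfaces"
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"
    return "public"


for addr in ("127.0.0.1", "10.0.0.1", "0.0.0.0"):
    print(addr, listen_address_exposure(addr))
```

Anything reported as `public` or `all-interfaces` deserves a firewall rule in front of it.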

content/advisories/grafanaloki.md

Lines changed: 117 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,117 @@
1+
+++
2+
title = 'Loki with Extstore'
3+
date = 2024-09-04T14:58:37-07:00
4+
+++
5+
6+
## Grafana Loki with Extstore
7+
8+
You may have seen this excellent blog post: https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/
9+
10+
... and are now attempting to make use of this knowledge, but something isn't
11+
working quite right. This document will give you a quick start in tuning Loki
12+
and Extstore to work well together.
13+
14+
## TLDR
15+
16+
We assume your Loki chunk storage size is 1.5MB.
17+
18+
For memcached, add at least the following tuning options:
19+
20+
`-I 2m -o ext_wbuf_size=32,ext_threads=10,ext_max_sleep=10000,slab_automove_freeratio=0.10,ext_recache_rate=0`
21+
22+
For example, your full start line may look like:
23+
24+
`memcached -m 6000 -I 2m -o ext_path=/disk/extstore:500G,ext_wbuf_size=32,ext_threads=10,ext_max_sleep=10000,slab_automove_freeratio=0.10,ext_recache_rate=0`
25+
26+
*Please set -m and ext_path appropriately for your system*. Leave some RAM for
27+
your system to breathe and a little disk space overhead.
28+
29+
Please use version 1.6.21 or newer as it improves the extstore write speed and
30+
fixes some related bugs.
31+
32+
In your Loki configuration:
33+
34+
```
chunk_store_config:
  chunk_cache_config:
    memcached:
      batch_size: 3
      parallelism: 2
    memcached_client:
      addresses: 127.0.0.1:11211
      timeout: 60s
    background:
      writeback_goroutines: 1
      writeback_buffer: 1000
      writeback_size_limit: 500MB
```
48+
49+
NOTE: `batch_size` can be set to 2x the number of memcached servers you have.
50+
So if you have 3 servers, 6 should work. Keep `parallelism` as low as
51+
possible, but increase this value if you are not maxing out network usage on
52+
memcached.
53+
54+
Loki's default configuration is very aggressive, which is normally fine for
55+
memory backed memcached. However extstore needs a little more time to fetch or
56+
write to disk.
57+
58+
Finally, please check that your memcached instances and Loki instances aren't
59+
swapping (out of RAM) or out of CPU, as this can make query times longer and
60+
cause timeouts.
61+
62+
---
63+
64+
### Why we have to tune these settings
65+
66+
Loki's defaults assume both A) A RAM backed cluster, and B) potentially a
67+
_large_ cluster made of tens to hundreds of cache nodes. Many users
68+
are trying a low number of memcached nodes with extstore (1-3).
69+
70+
When Loki fetches keys from a _pool_ of memcached servers, it will fetch a
71+
batch of `batch_size` keys from the _entire pool_ all at once. If you have a
72+
`batch_size` of 500 and 40 memcached servers, _each memcached_ will receive
73+
12-14 keys at the same time, as the batch is split across them.
74+
75+
If you are fetching `500` keys against `1` server, that is a much larger batch
76+
of keys hitting a single server. There are other issues with this but this
77+
document will not discuss them for now.
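
The batch arithmetic above, together with the `batch_size` rule of thumb from the TLDR, can be sketched in a few lines of Python. The function names are hypothetical, and the per-server figure is an average that assumes keys spread evenly across the pool, which real key hashing only approximates.

```python
def keys_per_server(batch_size, servers):
    """Average number of keys each server receives from one batch."""
    return batch_size / servers


def suggested_batch_size(servers):
    """Rule of thumb from this document: 2x the number of memcached servers."""
    return 2 * servers


print(keys_per_server(500, 40))  # large pool: a handful of keys per server
print(keys_per_server(500, 1))   # single server: the whole batch lands on it
print(suggested_batch_size(3))
```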
78+
79+
### Memcached tuning discussion
80+
81+
The defaults for extstore are fairly conservative. Most of the performance
82+
improvement you will see is from raising `ext_threads`, which allows it to
83+
fully utilize an SSD.
84+
85+
If you have a particularly fast SSD, the thread count can be further raised to
86+
20 or 30.
87+
88+
The number of memcached worker threads is specified with `-t` and defaults to
89+
`4`. Do _not_ set this higher than the number of CPUs your server has.
90+
91+
The rest of the tunings help provide minor speedups.
92+
93+
### Loki tuning discussion
94+
95+
Above we discuss the `batch_size` problem. You can also see `timeout` is set
96+
very high. We run into issues for a few reasons:
97+
98+
- Loki's memcache client timeout measures the amount of time to _fetch and
99+
read and process the entire batch of keys from each host_.
100+
101+
If you are fetching 2 keys from one host (3MB of data), the 100ms default might seem okay. However, retrieving 500 1.5 megabyte keys over the network from SSD on one host might take quite a while. If you have fewer memcached hosts, or your Loki server does not have a lot of CPU to process results quickly, this timeout will need to be set very high.
102+
103+
We will update this document if this changes.
104+
105+
- `writeback_goroutines: 1`
106+
107+
This defaults to 10. Loki will aggressively write all of the data it fetches
108+
from a backing store _back to memcached_ as each query runs. Memcached keeps
109+
some memory reserved as a buffer to give it time to flush data to disk. If it
110+
cannot write to disk fast enough, you will see the `evictions` counter
111+
increase.
112+
113+
There are not a lot of good options at the moment, but setting this value to
114+
`1` will help minimize the impact. If you still have trouble with `evictions`
115+
you may need to scale up to a faster memcached instance or add more instances.
116+
117+
We will update this document if anything changes.
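
To put rough numbers on the timeout discussion above, the sketch below computes a lower bound on the time needed to move a batch of 1.5MB chunks. The throughput value is a purely hypothetical assumption, not a measurement; substitute what your network and SSD actually sustain. It also ignores the time Loki spends processing results, so real batches take longer.

```python
def transfer_seconds(keys, key_size_mb, throughput_mb_per_s):
    """Lower bound on the time to move a batch at a sustained throughput."""
    return keys * key_size_mb / throughput_mb_per_s


ASSUMED_THROUGHPUT = 1000.0  # MB/s; hypothetical figure, measure your own
DEFAULT_TIMEOUT = 0.1        # seconds; the 100ms default mentioned above

small = transfer_seconds(2, 1.5, ASSUMED_THROUGHPUT)
large = transfer_seconds(500, 1.5, ASSUMED_THROUGHPUT)
print(f"2 keys: {small:.3f}s, 500 keys: {large:.3f}s, default timeout: {DEFAULT_TIMEOUT}s")
```

Under this assumption the two-key batch fits well inside the default timeout, while the 500-key batch against a single host does not.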

content/protocols/_index.md

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
+++
2+
title = 'Protocols'
3+
date = 2024-09-04T14:38:41-07:00
4+
weight = 4
5+
+++
6+
7+
Memcached supports a basic Text and Meta Text protocol. There is also a
8+
deprecated "binary" protocol, which no longer receives updates.
9+
The Meta protocol is more efficient than the older binary protocol and is cross
10+
compatible with the Text protocol. It is recommended for all new clients.
11+
12+
* [Text and Meta Protocol](http://github.com/memcached/memcached/blob/master/doc/protocol.txt)
13+
* [Meta Examples](/protocols/meta/)
14+
* [Binary Protocol](/protocols/binary/)
15+
* [Slides on binary protocol](http://www.slideshare.net/tmaesaka/memcached-binary-protocol-in-a-nutshell-presentation/) by Toru Maesaka (2008)
16+
17+
Further, there are sub-protocols and proposals:
18+
19+
* [Binary protocol SASL Authentication](/protocols/binarysasl/)
20+
21+
## Why is the binary protocol deprecated?
22+
23+
The binary protocol was introduced in 2008. It seemed like a good idea but it
24+
had many issues:
25+
26+
- Security issues
27+
- Poor efficiency prevented adoption by large users
28+
- Limited features and poor extensibility
29+
- SASL authentication rarely worked and caused a lot of support headaches.
30+
- Difficult to implement as a client.
31+
32+
We introduced the meta protocol in 2019. It offers:
33+
34+
- Better wire efficiency
35+
- More features (anti-stampeding herd, stale-while-revalidate, etc)
36+
- Small command set (get/set/delete/math)
37+
- Extensible (flags and tokens)
38+
- Simple and easy to implement
39+
40+
More information can be found in `protocol.txt` linked above.
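
To give a feel for how small the meta command set is on the wire, here is a minimal Python sketch that builds a meta get and a meta set request as raw bytes. It assumes the `mg <key> <flags>` and `ms <key> <datalen>` forms as described in `protocol.txt`; the helper names are hypothetical, and `protocol.txt` remains the authority on exact syntax and available flags.

```python
def meta_get(key):
    """Build a meta get request; the "v" flag asks for the value to be returned."""
    return f"mg {key} v\r\n".encode()


def meta_set(key, value):
    """Build a meta set request: command line with data length, then the data block."""
    return f"ms {key} {len(value)}\r\n".encode() + value + b"\r\n"


print(meta_get("foo"))
print(meta_set("foo", b"bar"))
```

These helpers only format requests; sending them over a socket and parsing the replies is left out.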

content/protocols/basic.md

Lines changed: 101 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,101 @@
1+
+++
2+
title = 'Basic Text Protocol'
3+
date = 2024-09-04T14:48:15-07:00
4+
weight = 1
5+
+++
6+
7+
Memcached handles a small number of basic commands.
8+
9+
Full documentation can be found in the [Protocol Documentation](/protocols/).
10+
11+
### Standard Protocol
12+
13+
The "standard protocol stuff" of memcached involves running a command against an "item". An item consists of:
14+
15+
* A key (arbitrary string up to 250 bytes in length. No spaces or newlines for ASCII mode)
16+
* A 32bit "flag" value
17+
* An expiration time, in seconds. '0' means never expire. Relative times can be up to 30 days; a value larger than 30 days is treated as a Unix timestamp of an exact date.
18+
* A 64bit "CAS" value, which is kept unique.
19+
* Arbitrary data
20+
21+
CAS is optional (it can be disabled entirely with `-C`), and there are more fields that internally make up an item, but these are what your client interacts with.
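
The expiration rule is easy to get wrong, so here is a minimal Python sketch of how a client might reason about it. The function name `absolute_expiry` is hypothetical; the sketch only models the 30-day cutoff described above and does not cover other cases such as negative values.

```python
THIRTY_DAYS = 60 * 60 * 24 * 30  # 2,592,000 seconds


def absolute_expiry(exptime, now):
    """Return the Unix time at which an item expires, or None for "never"."""
    if exptime == 0:
        return None            # 0 means never expire
    if exptime <= THIRTY_DAYS:
        return now + exptime   # relative offset from the current time
    return exptime             # already an absolute Unix timestamp


now = 1_700_000_000
print(absolute_expiry(0, now))
print(absolute_expiry(60, now))
print(absolute_expiry(1_800_000_000, now))
```

A common mistake is passing a relative value larger than 30 days, which is then read as a timestamp far in the past.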
22+
23+
#### No Reply
24+
25+
Most ASCII commands allow a "noreply" version. One should not normally use this with the ASCII protocol, as it is impossible to align errors with requests. The intent is to avoid having to wait for a return packet after executing a mutation command (such as a set or add).
26+
27+
The binary protocol properly implements noreply (quiet) statements. If you have a client which supports or uses the binary protocol, odds are good you may take advantage of this.
28+
29+
### Storage Commands
30+
31+
#### set
32+
33+
The most common command. Store this data, possibly overwriting any existing data. New items are at the top of the LRU.
34+
35+
#### add
36+
37+
Store this data, only if it does not already exist. New items are at the top of the LRU. If an item already exists and an add fails, the existing item is promoted to the front of the LRU anyway.
38+
39+
#### replace
40+
41+
Store this data, but only if the data already exists. Almost never used, and exists for protocol completeness (set, add, replace, etc.).
42+
43+
#### append
44+
45+
Add this data after the last byte in an existing item. This does not allow you to extend past the item limit. Useful for managing lists.
46+
47+
#### prepend
48+
49+
Same as append, but adding new data before existing data.
50+
51+
#### cas
52+
53+
Check And Set (or Compare And Swap). An operation that stores data, but only if no one else has updated the data since you read it last. Useful for resolving race conditions on updating cache data.
54+
55+
### Retrieval Commands
56+
57+
#### get
58+
59+
Command for retrieving data. Takes one or more keys and returns all found items.
60+
61+
#### gets
62+
63+
An alternative get command for use with CAS. Returns a CAS identifier (a unique 64bit number) with the item. Pass this value back with the `cas` command. If the item's CAS value has changed since you `gets`'ed it, it will not be stored.
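
The `gets`/`cas` round trip can be illustrated with a toy in-memory model. This is not memcached and not a client library; `ToyCache` is a hypothetical class that only mimics the compare-and-swap rule, using the `STORED`, `EXISTS`, and `NOT_FOUND` replies of the text protocol as return values.

```python
import itertools


class ToyCache:
    """Toy model of gets/cas semantics; every store assigns a fresh CAS id."""

    def __init__(self):
        self._items = {}
        self._next_cas = itertools.count(1)

    def set(self, key, value):
        self._items[key] = (value, next(self._next_cas))

    def gets(self, key):
        # Returns (value, cas_id), or None when the key is missing.
        return self._items.get(key)

    def cas(self, key, value, cas_id):
        current = self._items.get(key)
        if current is None:
            return "NOT_FOUND"
        if current[1] != cas_id:
            return "EXISTS"  # someone else updated the item since our gets
        self.set(key, value)
        return "STORED"


cache = ToyCache()
cache.set("counter", b"1")
_, stale_id = cache.gets("counter")
cache.set("counter", b"2")  # a concurrent writer gets in first
print(cache.cas("counter", b"3", stale_id))
_, fresh_id = cache.gets("counter")
print(cache.cas("counter", b"3", fresh_id))
```

On `EXISTS`, a real client would typically re-read the item with `gets`, recompute the new value, and try again.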
64+
65+
### delete
66+
67+
Removes an item from the cache, if it exists.
68+
69+
### incr/decr
70+
71+
Increment and Decrement. If an item stored is the string representation of an unsigned 64bit integer, you may run incr or decr commands to modify that number. Both incr and decr take a positive amount; they do not accept negative values.
72+
73+
If a value does not already exist, incr/decr will fail.
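
The rules above can be modeled with a small Python function. This is a toy sketch, not memcached's implementation: `apply_delta` is a hypothetical name, and the overflow and underflow behavior (incr wrapping at 64 bits, decr stopping at 0) is an assumption based on the protocol documentation rather than something stated on this page.

```python
U64 = 2 ** 64


def apply_delta(stored, delta, decrement=False):
    """Model incr/decr on a stored decimal string; None means the key is missing."""
    if stored is None:
        return None  # incr/decr fail when the value does not exist
    if delta < 0:
        raise ValueError("delta must not be negative")
    if not (stored.isascii() and stored.isdigit()):
        raise ValueError("stored value is not a decimal number")
    value = int(stored)
    if decrement:
        value = max(0, value - delta)  # assumption: decr never goes below 0
    else:
        value = (value + delta) % U64  # assumption: incr wraps at 64 bits
    return str(value)


print(apply_delta("10", 5))
print(apply_delta("3", 5, decrement=True))
print(apply_delta(None, 1))
```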
74+
75+
### Statistics
76+
77+
There are a handful of commands that return counters and settings of the memcached server. These can be inspected via a large array of tools or simply by telnet or netcat. These are further explained in the protocol docs.
78+
79+
#### stats
80+
81+
The basic stats command. Returns general counters and settings for the server.
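
If you capture the raw reply over telnet or netcat, it is easy to turn into a dictionary. The sketch below assumes the text protocol's reply shape of `STAT <name> <value>` lines terminated by `END`; the function name `parse_stats` and the sample values are hypothetical.

```python
def parse_stats(response):
    """Parse a raw text-protocol stats reply into a {name: value} dict of strings."""
    stats = {}
    for line in response.decode().split("\r\n"):
        if line == "END":
            break
        if line.startswith("STAT "):
            # Split into "STAT", the stat name, and the rest as the value.
            _, name, value = line.split(" ", 2)
            stats[name] = value
    return stats


sample_reply = b"STAT pid 16079\r\nSTAT evictions 0\r\nEND\r\n"
print(parse_stats(sample_reply))
```

Values are left as strings because stats mix integers, floats, and free-form text.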
82+
83+
#### stats items
84+
85+
Returns some information, broken down by slab, about items stored in memcached.
86+
87+
#### stats slabs
88+
89+
Returns more information, broken down by slab, about items stored in memcached. More centered to performance of a slab rather than counts of particular items.
90+
91+
#### stats sizes
92+
93+
A special command that shows you how items would be distributed if slabs were broken into 32-byte buckets instead of your current number of slabs. Useful for determining how efficient your slab sizing is.
94+
95+
*WARNING*: this is a development command. As of 1.4 it is still the only command which will lock your memcached instance for some time. If you have many millions of stored items, the instance can become unresponsive for several minutes. Run this at your own risk. The roadmap is to either make this feature optional or at least speed it up.
96+
97+
### flush_all
98+
99+
Invalidate all existing cache items. Optionally takes a parameter N, in which case the items are invalidated after N seconds have passed.
100+
101+
This command does not pause the server, as it returns immediately. It does not free up or flush memory at all; it just causes all items to expire.
