+++
title = 'Loki with Extstore'
date = 2024-09-04T14:58:37-07:00
+++

## Grafana Loki with Extstore

You may have seen this excellent blog post: https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/
| 9 | + |
| 10 | +... and are now attempting to make use of this knowledge, but something isn't |
| 11 | +working quite right. This document will give you a quick start in tuning Loki |
| 12 | +and Extstore to work well together. |
| 13 | + |
## TLDR

We assume your Loki chunk storage size is 1.5MB.

For memcached, add at least the following tuning options:

`-I 2m -o ext_wbuf_size=32,ext_threads=10,ext_max_sleep=10000,slab_automove_freeratio=0.10,ext_recache_rate=0`

That is, your full start line may look like:

`memcached -m 6000 -I 2m -o ext_path=/disk/extstore:500G,ext_wbuf_size=32,ext_threads=10,ext_max_sleep=10000,slab_automove_freeratio=0.10,ext_recache_rate=0`

*Please set `-m` and `ext_path` appropriately for your system*. Leave some RAM
for your system to breathe and a little disk space overhead.

Please use version 1.6.21 or newer, as it improves the extstore write speed
and fixes some related bugs.

In your Loki configuration:

```
chunk_store_config:
  chunk_cache_config:
    memcached:
      batch_size: 3
      parallelism: 2
    memcached_client:
      addresses: 127.0.0.1:11211
      timeout: 60s
    background:
      writeback_goroutines: 1
      writeback_buffer: 1000
      writeback_size_limit: 500MB
```

NOTE: `batch_size` can be set to 2x the number of memcached servers you have.
So if you have 3 servers, 6 should work. Keep `parallelism` as low as
possible, but increase this value if you are not maxing out network usage on
memcached.
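
As a back-of-the-envelope sketch of that rule (the helper names here are illustrative, not part of Loki):

```python
def suggested_batch_size(num_servers: int) -> int:
    # Rule of thumb from above: batch_size of roughly 2x the memcached server count.
    return 2 * num_servers

def keys_per_server(batch_size: int, num_servers: int) -> float:
    # Approximate keys each memcached receives, since the batch is split across the pool.
    return batch_size / num_servers

print(suggested_batch_size(3))   # 3 servers -> batch_size of 6
print(keys_per_server(6, 3))     # -> 2.0 keys per server per batch
```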

Loki's default configuration is very aggressive, which is normally fine for
memory-backed memcached. However, extstore needs a little more time to fetch
from or write to disk.

Finally, please check that your memcached instances and Loki instances aren't
swapping (out of RAM) or out of CPU, as this can make query times longer and
cause timeouts.

---

### Why we have to tune these settings

Loki's defaults assume both A) a RAM-backed cluster, and B) a potentially
_large_ cluster made of tens to hundreds of cache nodes. Many users are
trying a low number of memcached nodes with extstore (1-3).

When Loki fetches keys from a _pool_ of memcached servers, it sends a
`batch_size` worth of keys to the _entire pool_ all at once. If you have a
`batch_size` of 500 and 40 memcached servers, _each memcached_ will receive
12-14 keys at the same time, as the batch is split across them.
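
A rough sketch of that split, assuming a simple hash-modulo client (real memcached clients typically use consistent hashing, but the per-server load comes out similar):

```python
import hashlib

def server_for_key(key: str, num_servers: int) -> int:
    # Pick a server by hashing the key, roughly as a memcached client would.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_servers

# Split a batch of 500 chunk keys across a pool of 40 servers.
counts = [0] * 40
for i in range(500):
    counts[server_for_key(f"chunk/{i}", 40)] += 1

# Each server ends up with roughly 500 / 40 = 12-13 keys, give or take.
print(min(counts), max(counts))
```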

If you are fetching `500` keys against `1` server, that is a much larger batch
of keys hitting a single server. There are other issues with this, but this
document will not discuss them for now.

### Memcached tuning discussion

The defaults for extstore are fairly conservative. Most of the performance
improvement you will see is from raising `ext_threads`, which allows extstore
to fully utilize an SSD.

If you have a particularly fast SSD, the thread count can be raised further to
20 or 30.

The number of memcached worker threads is specified with `-t` and defaults to
`4`; do _not_ set this higher than the number of CPUs your server has.

The rest of the tunings provide minor speedups.

### Loki tuning discussion

Above we discussed the `batch_size` problem. You can also see that `timeout`
is set very high. We run into issues for a few reasons:

- Loki's memcached client timeout measures the amount of time to _fetch, read,
  and process the entire batch of keys from each host_.

If you are fetching 2 keys from one host (3MB of data), the 100ms default
might seem okay. However, retrieving 500 1.5-megabyte keys over the network
from SSD on one host might take quite a while. If you have fewer memcached
hosts, or your Loki server does not have a lot of CPU to process results
quickly, this timeout will need to be set very high.
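
Some quick arithmetic shows why; the 10 Gbit/s link speed here is an assumption for illustration:

```python
# How long does one worst-case batch take to pull off a single host?
keys = 500
chunk_bytes = 1.5 * 1024 * 1024        # 1.5MB Loki chunks
link_bytes_per_sec = 10e9 / 8          # assume a 10 Gbit/s link, fully dedicated

total_bytes = keys * chunk_bytes       # ~786MB for one batch
transfer_secs = total_bytes / link_bytes_per_sec

# Even with the network saturated, the transfer alone takes ~630ms --
# far past a 100ms timeout, before SSD reads and Loki-side processing.
print(f"{total_bytes / 1e6:.0f} MB, {transfer_secs * 1000:.0f} ms")
```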

We will update this document if this changes.

- `writeback_goroutines: 1`

This defaults to 10. Loki will aggressively write all of the data it fetches
from a backing store _back to memcached_ as each query runs. Memcached keeps
some memory reserved as a buffer to give it time to flush data to disk. If it
cannot write to disk fast enough, you will see the `evictions` counter
increase.

There are not a lot of good options at the moment, but setting this value to
`1` will help minimize the impact. If you still have trouble with `evictions`,
you may need to scale up to a faster memcached instance or add more instances.
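
A toy model of the write buffer illustrates the effect; all rates and sizes below are made up for the sketch:

```python
def evicted_mb(write_mb_s: float, disk_mb_s: float, buffer_mb: float, seconds: int) -> float:
    # Loki writes back at write_mb_s; extstore drains the buffer at disk_mb_s.
    # Whatever overflows the buffer is counted as evicted.
    buffered = evicted = 0.0
    for _ in range(seconds):
        buffered += write_mb_s
        buffered -= min(buffered, disk_mb_s)
        if buffered > buffer_mb:
            evicted += buffered - buffer_mb
            buffered = buffer_mb
    return evicted

# Aggressive writeback outrunning the disk evicts heavily; throttled writeback does not.
print(evicted_mb(write_mb_s=500, disk_mb_s=200, buffer_mb=1000, seconds=60))
print(evicted_mb(write_mb_s=50, disk_mb_s=200, buffer_mb=1000, seconds=60))
```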

We will update this document if anything changes.