Redis String #
Redis is an in-memory data store: it keeps all of its data in memory. Although a server can have hundreds of GB of memory, memory is still limited and more expensive than hard disk, so Redis does its best to compact the data it holds and save memory space.
Redis String can be encoded as:
- raw, OBJ_ENCODING_RAW
- int, OBJ_ENCODING_INT
- embstr, OBJ_ENCODING_EMBSTR
The structures are defined in sds.h. To save space, Redis defines five similar structs to hold a byte array.
The struct to use depends on the byte array’s size:
- if its size is less than 2^8, sdshdr8 is used; its header needs 3 extra bytes.
- if its size is less than 2^16, sdshdr16 is used; its header needs 5 extra bytes.
- if its size is less than 2^32, sdshdr32 is used; its header needs 9 extra bytes.
- if its size is less than 2^64, sdshdr64 is used; its header needs 17 extra bytes.
typedef char *sds;

/* Note: sdshdr5 is never used, we just access the flags byte directly.
 * However is here to document the layout of type 5 SDS strings. */
struct __attribute__ ((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr16 {
    uint16_t len; /* used */
    uint16_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr32 {
    uint32_t len; /* used */
    uint32_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr64 {
    uint64_t len; /* used */
    uint64_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
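To make the size rule concrete, here is a minimal sketch that picks a header by length and reports its size. The names hdr8 to hdr64 and header_size_for are hypothetical local copies and helpers, not Redis functions, and the sketch follows only the four thresholds listed above (it ignores sdshdr5). It assumes a compiler that supports __attribute__((__packed__)), such as GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local copies of the packed headers shown above. */
struct __attribute__ ((__packed__)) hdr8 {
    uint8_t len; uint8_t alloc; unsigned char flags; char buf[];
};
struct __attribute__ ((__packed__)) hdr16 {
    uint16_t len; uint16_t alloc; unsigned char flags; char buf[];
};
struct __attribute__ ((__packed__)) hdr32 {
    uint32_t len; uint32_t alloc; unsigned char flags; char buf[];
};
struct __attribute__ ((__packed__)) hdr64 {
    uint64_t len; uint64_t alloc; unsigned char flags; char buf[];
};

/* Hypothetical helper: header size in bytes for a byte array of `len` bytes.
 * The flexible array member buf[] does not count towards sizeof. */
static size_t header_size_for(uint64_t len) {
    if (len < (1ULL << 8))  return sizeof(struct hdr8);
    if (len < (1ULL << 16)) return sizeof(struct hdr16);
    if (len < (1ULL << 32)) return sizeof(struct hdr32);
    return sizeof(struct hdr64);
}
```

With these definitions, header_size_for(255) gives 3 and header_size_for(256) gives 5, because the packed attribute removes any padding between the fields.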
int/OBJ_ENCODING_INT #
If a redisObject’s encoding is OBJ_ENCODING_INT, the value is stored directly in ptr instead of ptr pointing to the value. This works because on a typical 64-bit platform a pointer takes 8 bytes, the same size as a long (and a long long).

Here is the code from t_string.c that handles the incr and decr commands.
Note the line o->ptr = (void*)((long)value), which writes the new value straight into ptr:
// t_string.c
// some irrelevant code is removed
void incrDecrCommand(client *c, long long incr) {
    long long value, oldvalue;
    robj *o, *new;

    o = lookupKeyWrite(c->db,c->argv[1]);
    if (checkType(c,o,OBJ_STRING)) return;
    if (getLongLongFromObjectOrReply(c,o,&value,NULL) != C_OK) return;

    oldvalue = value;
    if ((incr < 0 && oldvalue < 0 && incr < (LLONG_MIN-oldvalue)) ||
        (incr > 0 && oldvalue > 0 && incr > (LLONG_MAX-oldvalue))) {
        addReplyError(c,"increment or decrement would overflow");
        return;
    }
    value += incr;

    if (o && o->refcount == 1 && o->encoding == OBJ_ENCODING_INT &&
        (value < 0 || value >= OBJ_SHARED_INTEGERS) &&
        value >= LONG_MIN && value <= LONG_MAX)
    {
        new = o;
        o->ptr = (void*)((long)value);
    } else {
        new = createStringObjectFromLongLongForValue(value);
        if (o) {
            dbOverwrite(c->db,c->argv[1],new);
        } else {
            dbAdd(c->db,c->argv[1],new);
        }
    }
}
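The overflow guard in the middle of that function is worth looking at on its own. Below is a small sketch that lifts the same comparison into a standalone helper; checked_add is a hypothetical name, not a Redis function. It compares against LLONG_MIN and LLONG_MAX before adding, so the signed addition itself never overflows.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds incr to *value.
 * Returns false and leaves *value untouched when the sum would overflow. */
static bool checked_add(long long *value, long long incr) {
    long long old = *value;
    if ((incr < 0 && old < 0 && incr < (LLONG_MIN - old)) ||
        (incr > 0 && old > 0 && incr > (LLONG_MAX - old))) {
        return false;
    }
    *value = old + incr;
    return true;
}
```

Overflow is only possible when both operands have the same sign, which is why each branch first checks the signs and only then does the subtraction.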
raw/OBJ_ENCODING_RAW #
A raw string uses the structs above to store its data.
Creating a raw string involves two structs, redisObject and sdshdrx, where x can be 8, 16, 32 or 64.
The data structure in memory looks like this: a redisObject whose ptr field points to the buf of a separately allocated sdshdrx.

The function that creates a raw string is in object.c:
// object.c
/* Create a string object with encoding OBJ_ENCODING_RAW, that is a plain
 * string object where o->ptr points to a proper sds string. */
robj *createRawStringObject(const char *ptr, size_t len) {
    return createObject(OBJ_STRING, sdsnewlen(ptr,len));
}

robj *createObject(int type, void *ptr) {
    robj *o = zmalloc(sizeof(*o));
    o->type = type;
    o->encoding = OBJ_ENCODING_RAW;
    o->ptr = ptr;
    o->refcount = 1;

    /* Set the LRU to the current lruclock (minutes resolution), or
     * alternatively the LFU counter. */
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
    } else {
        o->lru = LRU_CLOCK();
    }
    return o;
}
embstr/OBJ_ENCODING_EMBSTR #
When creating a string, if the length is at most 44 bytes, Redis creates an embstr rather than a raw string.
Why? Because it’s more efficient.
To create a raw string, Redis has to allocate memory twice: once for the redisObject and once for the sdshdrx. An embstr needs only one allocation. Let’s take a look at how Redis creates an embstr:
// object.c
/* Create a string object with encoding OBJ_ENCODING_EMBSTR, that is
 * an object where the sds string is actually an unmodifiable string
 * allocated in the same chunk as the object itself. */
robj *createEmbeddedStringObject(const char *ptr, size_t len) {
    robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr8)+len+1);
    struct sdshdr8 *sh = (void*)(o+1);

    o->type = OBJ_STRING;
    o->encoding = OBJ_ENCODING_EMBSTR;
    o->ptr = sh+1;
    o->refcount = 1;
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
    } else {
        o->lru = LRU_CLOCK();
    }

    sh->len = len;
    sh->alloc = len;
    sh->flags = SDS_TYPE_8;
    if (ptr == SDS_NOINIT)
        sh->buf[len] = '\0';
    else if (ptr) {
        memcpy(sh->buf,ptr,len);
        sh->buf[len] = '\0';
    } else {
        memset(sh->buf,0,len+1);
    }
    return o;
}
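Redis allocates one contiguous chunk of memory to hold both the redisObject and the sdshdr8. This is also good for memory management: a redisObject takes 16 bytes and an sdshdr8 header takes 3 bytes, and too many small memory blocks are hard to manage.
So the memory layout is a single chunk: the redisObject, immediately followed by the sdshdr8 header, the string bytes, and the terminating \0.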
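The single-allocation layout can be reproduced with plain malloc. The sketch below is an illustration under assumptions, not the Redis implementation: toy_robj and toy_hdr8 are hypothetical stand-ins for redisObject and sdshdr8, the TOY_ constants are made up for the example, and it assumes GCC or Clang for the packed attribute.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constants used only in this sketch. */
enum { TOY_STRING = 0, TOY_EMBSTR = 1, TOY_HDR_TYPE_8 = 1 };

/* Hypothetical stand-in for redisObject. */
struct toy_robj {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24;
    int refcount;
    void *ptr;
};

/* Hypothetical stand-in for sdshdr8. */
struct __attribute__ ((__packed__)) toy_hdr8 {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

/* One allocation: object, header and bytes share a single chunk.
 * len must fit in a uint8_t. The caller releases it with one free(). */
static struct toy_robj *toy_new_embedded(const char *s, size_t len) {
    struct toy_robj *o = malloc(sizeof(*o) + sizeof(struct toy_hdr8) + len + 1);
    if (o == NULL) return NULL;
    struct toy_hdr8 *sh = (void *)(o + 1);  /* header starts right after the object */

    o->type = TOY_STRING;
    o->encoding = TOY_EMBSTR;
    o->lru = 0;
    o->refcount = 1;
    o->ptr = sh->buf;                       /* string bytes follow the header */

    sh->len = (uint8_t)len;
    sh->alloc = (uint8_t)len;
    sh->flags = TOY_HDR_TYPE_8;
    memcpy(sh->buf, s, len);
    sh->buf[len] = '\0';
    return o;
}

/* Distance in bytes from the start of the object to the first string byte. */
static size_t toy_payload_offset(const struct toy_robj *o) {
    return (size_t)((const char *)o->ptr - (const char *)o);
}
```

For any string, toy_payload_offset returns sizeof(struct toy_robj) plus the 3 header bytes, which shows that the string sits directly behind the object in the same chunk.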

When creating a string object, Redis picks the encoding depending on the given string’s length.
// object.c
/* Create a string object with EMBSTR encoding if it is smaller than
 * OBJ_ENCODING_EMBSTR_SIZE_LIMIT, otherwise the RAW encoding is
 * used.
 *
 * The current limit of 44 is chosen so that the biggest string object
 * we allocate as EMBSTR will still fit into the 64 byte arena of jemalloc. */
#define OBJ_ENCODING_EMBSTR_SIZE_LIMIT 44
robj *createStringObject(const char *ptr, size_t len) {
    if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT)
        return createEmbeddedStringObject(ptr,len);
    else
        return createRawStringObject(ptr,len);
}
Why is Redis embstr’s max length 44 bytes? #
First of all, a Redis string is null-terminated to stay compatible with C library functions such as printf(), so when creating a string Redis always appends a \0 at the end, which takes an extra byte.
As the comment above createStringObject says, the goal is to fit into the 64 byte arena of jemalloc. Each embstr takes 20 extra bytes (16 bytes for redisObject, 3 bytes for sdshdr8, and 1 byte for \0), so there are 64 - 20 = 44 bytes left for the string itself.
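The arithmetic can be checked with sizeof. This is a sketch under assumptions: sketch_robj and sketch_hdr8 are hypothetical stand-ins with the same fields as redisObject and sdshdr8, the helper names are made up, and the result of 44 assumes a 64-bit platform where the stand-in object is 16 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for redisObject (16 bytes on a 64-bit platform). */
struct sketch_robj {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24;
    int refcount;
    void *ptr;
};

/* Hypothetical stand-in for sdshdr8 (3 bytes, buf[] is not counted). */
struct __attribute__ ((__packed__)) sketch_hdr8 {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

enum { SKETCH_CHUNK_SIZE = 64 };  /* the 64 byte chunk the comment refers to */

/* Bytes left for string data after the object, the header and the '\0'. */
static size_t sketch_embstr_limit(void) {
    return SKETCH_CHUNK_SIZE
         - sizeof(struct sketch_robj)
         - sizeof(struct sketch_hdr8)
         - 1;
}

/* Mirrors the length test in createStringObject: embstr up to the limit. */
static const char *sketch_encoding_for(size_t len) {
    return len <= sketch_embstr_limit() ? "embstr" : "raw";
}
```

Under those assumptions sketch_embstr_limit() evaluates to 44, so a 44-byte string is still an embstr and a 45-byte string becomes raw.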
Why does Redis want to keep the memory chunk size less than 64 bytes? #
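Memory blocks that are too small or too big are both bad for memory management (yes, that’s memory: hard to maintain). I guess 64 is an empirical value. It is also one of jemalloc’s small size classes, so a request of exactly 64 bytes should be served without being rounded up to a larger block.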
You may visit jemalloc homepage to learn more about memory management.