It's been a while since I posted anything new; I recently started my own company, LambdaCu.be, and have been massively busy. If you want to hire me, ping me at kuba@lambdacu.be =)
I have a lot of texts in the pipeline about Lua scripting in Redis and using it to build some tools, but I can't find the time to finish them ;/.
Auto failover
Every database wants to have an auto-failover mechanism. It's a great marketing pitch! haha :) The main idea is that one of your servers can go down and you keep operating as normal, and when it comes back up everything is fine... unless your routing server goes down too, of course :D
2.4.16 / 2.6
Around Redis 2.4 and 2.6 the idea of adding it came up. Antirez wrote a draft spec and implemented it as an experimental feature. It is really well described at http://redis.io/topics/sentinel, so I will just write a short note on how I set it up and how it feels.
Setup
While preparing this demo I did everything on master commit 0ee3f05518e081640c1c6f9ae52c3a414f0feace, so all I did was start a "master" and "replica" server with these configs
(ofc turn daemonize to yes in production lol)
A standard master setup with the default config on port 6379, plus a replica and a sentinel configured roughly as sketched below.
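The original configs aren't reproduced here; this is only a minimal sketch, assuming directive names from the sentinel doc linked above (ports, the master name "mymaster" and the timeouts are my own choices):

# redis-replica.conf
port 6380
slaveof 127.0.0.1 6379

# sentinel.conf (started with: redis-server sentinel.conf --sentinel)
port 26379
sentinel monitor mymaster 127.0.0.1 6379 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 900000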
Cool, works great :D The only thing that worried me was that when I brought the old master back up after failover (which took 8 seconds), it did not pick up that it is now a slave and did not start replicating data.
when you do this…
You will see a beefy log like this:
[36373] 26 Sep 21:15:03.441 # Error condition on socket for SYNC: Connection refused
[36373] 26 Sep 21:15:04.521 * Connecting to MASTER...
[36373] 26 Sep 21:15:04.521 * MASTER <-> SLAVE sync started
[36373] 26 Sep 21:15:04.521 # Error condition on socket for SYNC: Connection refused
[36373] 26 Sep 21:15:05.128 * MASTER MODE enabled (user request)
On the initial slave :) things just went from bad to good :D
Summary
This is a cool new feature: you can have a master-slave setup with auto failover, and the only thing the driver has to do when it gets an error while connecting / querying is ask sentinel for the new master, then connect and retry :) It is very basic but…
I like it!
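The driver-side lookup mentioned above is just one sentinel command; assuming the master was registered under the name mymaster and sentinel listens on its default port, the reply would look roughly like this:

redis 127.0.0.1:26379> SENTINEL get-master-addr-by-name mymaster
1) "127.0.0.1"
2) "6379"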
THIS IS AN EXPERIMENTAL FEATURE and you can find much more info about it at http://redis.io/topics/sentinel, especially about the pub/sub way of watching events as they occur.
On the official Redis site http://redis.io you can find this post http://redis.io/topics/twitter-clone/ about building a Twitter clone in Redis. I based my design partially on it, but I would like to go deeper into building the timeline and posts.
Quick review
I used a similar approach to store followers and following, so I will just go quickly through the keys and design.
twtr:<user_id>:following -> set of ids this user follows
twtr:<user_id>:followers -> set of ids that follow this user
What happens when I click "follow"?
example
redis 127.0.0.1:6379> SADD twtr:kuba:following amelia
(integer) 1
redis 127.0.0.1:6379> SADD twtr:amelia:followers kuba
(integer) 1
redis 127.0.0.1:6379> SADD twtr:kuba:following dan
(integer) 1
redis 127.0.0.1:6379> SADD twtr:dan:followers kuba
(integer) 1
redis 127.0.0.1:6379> SADD twtr:kuba:following ben
(integer) 1
redis 127.0.0.1:6379> SADD twtr:ben:followers kuba
(integer) 1
redis 127.0.0.1:6379> SMEMBERS twtr:kuba:following
1)"ben"2)"amelia"3)"dan"
This way, for each user we can see the set of people who follow him and those he follows. That's all we need, just like in the tutorial.
Post
So I think it is not a waste if we decide to keep a post in the form of a couple of keys:
twtr:<user_id>:post:<post_id> -> content text of post
twtr:<user_id>:post:<post_id>:created_at -> creation time of post
twtr:post_owner:<post_id> -> id of post creator.
Why this approach and not compacting everything into a single pipe-separated key? Both solutions seem ok; this one just leaves a little bit more flexibility. I know it will generate twice as many lookups to Redis as the other, so you can consider doing
twtr:<user_id>:post:<post_id> -> (timestamp|text)
Both solutions have pros and cons: the first requires twice the lookups and the second requires parsing the data in the app layer. Still, I prefer the first one.
Post id
This is a hard topic, because in the future we will want to scale (lol). Generating a post id is not an easy task in this case. We could just use an auto-incrementing counter like this:
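For example (the key name twtr:next_post_id is my choice, not something fixed by the design above):

redis 127.0.0.1:6379> INCR twtr:next_post_id
(integer) 1
redis 127.0.0.1:6379> INCR twtr:next_post_id
(integer) 2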
For each user we will store a list of the posts he wrote. Initially I thought we could just pump everything into a single list, but this is not optimal: it is a single structure that will grow like crazy, and we will not be able to decide how and when to archive the parts that are no longer used. E.g. posts made by a user 8 months ago are not really relevant today; if we assume an average person posts a few times a week, an 8-month-old entry will be long forgotten. We want to archive it, and it is also healthier for memory to store short lists.
I see here two scenarios:
user looks at his last few posts < 100
user is infinite scrolling through all posts.
So for these scenarios it seems reasonable to have a list of lists, in which we keep ordered post list ids. If we only use LPUSH to add post lists to this list, we will be able to do an easy LRANGE 0 <n> to get the newest lists.
twtr:<user_id>:lists -> list of list ids; only LPUSH id and LRANGE 0 <number>
twtr:<user_id>:list_next -> auto-incr counter for list ids
twtr:<user_id>:list:<list_id> -> list with 100 posts
So how do we get the most recent posts? We just LRANGE 0 1 to get the two most recent lists and then merge them, first + second. Both were LPUSH'ed so they should be semi-ordered (we don't really care about exact order). Adding stuff to the timeline is a bit tricky.
We need to do it like this:
LRANGE <lists key> 0 0 (or simply LINDEX <lists key> 0) to get the current list id, then LLEN on that list to check its size. If it is below SIZE (100 in our example) we just LPUSH <list key> <post id> and the job is done; if the list has reached 100 we INCR the list counter, LPUSH its result onto the list of lists, and then LPUSH <post id> onto the new list.
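A rough sketch of that flow using the node_redis client (function and variable names are mine; error handling omitted):

var redis  = require('redis');
var client = redis.createClient();

var SIZE = 100; // max posts per list

// Append postId to the user's newest post list, rolling over to a fresh list when full.
function addPost(userId, postId) {
  var listsKey   = 'twtr:' + userId + ':lists';
  var counterKey = 'twtr:' + userId + ':list_next';

  client.lindex(listsKey, 0, function(err, currentListId) {
    var listKey = 'twtr:' + userId + ':list:' + currentListId;
    client.llen(listKey, function(err, len) {
      if (currentListId !== null && len < SIZE) {
        client.lpush(listKey, postId);          // still room in the current list
      } else {
        client.incr(counterKey, function(err, newListId) {
          client.lpush(listsKey, newListId);    // register the new list id
          client.lpush('twtr:' + userId + ':list:' + newListId, postId);
        });
      }
    });
  });
}

None of this is atomic, so in a real app you would want to wrap it in MULTI or a Lua script.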
And all of this happens in the application layer. This is the hardest bit to do. It may seem complicated, and if it looks suboptimal to you, you can add one more list:
twtr:<user_id>:list:current -> list of current 100 posts
This list holds just the most recent posts of a particular user. How does it work? The algorithm is simple:
LPUSH new post id’s
RPOP if SIZE > 100
This can be useful to reduce the number of hits against Redis.
Time line
Now the timeline. The timeline is exactly the same as the user post list; we just need one more "bit" about adding posts.
The algorithm here is: when you add a post, you have to pick all the ids of the people who follow you (using the example from the top, if you are adding a post as amelia):
SMEMBERS twtr:<user_id>:followers
And you need to push your post id onto their timeline post lists. That's all. Of course, we need to add keys for the timeline:
twtr:<user_id>:timeline:lists -> list of list ids; only LPUSH id and LRANGE 0 <number>
twtr:<user_id>:timeline:list_next -> auto-incr counter for list ids
twtr:<user_id>:timeline:list:<list_id> -> list with 100 posts
twtr:<user_id>:timeline:current -> LPUSH, RPOP > SIZE current list cache
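Putting it together, posting as amelia might look roughly like this (the single follower comes from the earlier example; post id 42 is made up, and I'm pushing straight onto the current-list cache here):

redis 127.0.0.1:6379> SMEMBERS twtr:amelia:followers
1) "kuba"
redis 127.0.0.1:6379> LPUSH twtr:kuba:timeline:current 42
(integer) 1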
Summary
This is how I would approach building a Twitter-like clone. Things like old lists can easily be archived into MySQL, PostgreSQL or something else and EXPIREd from Redis. One common thing in my design is that I put <user_id> into a lot of keys; this could be skipped but in my opinion it is not bad. If you use <user_id> in the form of an md5 of the user's email, you can use it directly to access that user's gravatar.
On average you will need to do around 10-30 hits to Redis to get the data; if you plan to do it in a "lazy way" you can minimize the number of hits to around 10.
If you see a problem with my design, comment, I want to know about it! The core of this design is that each user's post data is stored in one Redis instance. This is important because of access and race-condition issues if you have many Redis instances, but achieving "sharding" in the application layer is not hard. The only thing I would worry about is the post id generator; it is a single point of failure because I have a strong assertion that post_id is unique in the whole system.
Long time, nothing new here, so I will glue together something about stuff I was talking about today with my friend Jarek. We talked about building a backend for a Todo app :). Yes, a simple todo app, and how to build a scalable backend for it. My initial thought was "how would I design it in different databases?" (I'm talking only about the data model.)
Requirements
What we know:
User has some sort of id. (number, email, hash of something)
We need to be able to have different todo lists
User can choose his todo list and see tasks ( obvious )
User can tag tasks!
User can query tasks in list by tags
User can see all tags.
Design using Redis
How to do it with redis ? :)
A few facts I assumed at the start: a single todo task has a body and timestamps [created_at, updated_at], and the base prefix for keys will be "todo".
So let's start with the user and his list of todo lists :). This gives us the first keys:
todo:<user_id>:todolist:next => auto-incrementing counter for list ids
todo:<user_id>:todolists => [LIST]
todo:<user_id>:todolist:<todo_list_id>:name => list name
Here we have the list id counter that we bump to get a new list id :), the list of todolist ids, and a name key for each list. Why do it this way? Well, people can add and remove todo lists.
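Creating the first list might look roughly like this (the exact commands aren't reproduced here; the list name "House work" is borrowed from the MongoDB example later on):

redis 127.0.0.1:6379> INCR todo:kuba:todolist:next
(integer) 1
redis 127.0.0.1:6379> LPUSH todo:kuba:todolists 1
(integer) 1
redis 127.0.0.1:6379> SET todo:kuba:todolist:1:name "House work"
OK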
Hey! We just added the id of our first list to the list of our todo lists (lots of the word "list" here!). Ok, so now let's add a task. First the keys for the list:
todo:<user_id>:todolist:<todo_list_id>:next => auto incrementing counter for tasks id
todo:<user_id>:todolist:<todo_list_id> => [LIST]
and task:
todo:<user_id>:task:<task_id> => content of task eg. "finish blog post"
todo:<user_id>:task:<task_id>:created_at => epoch time when it was created, handled by app
todo:<user_id>:task:<task_id>:updated_at => epoch time when it was last updated handled by app
Ok, so how do I add a task to my list?
adding task
redis 127.0.0.1:6379> INCR todo:kuba:todolist:1:next
(integer) 1
redis 127.0.0.1:6379> LPUSH todo:kuba:todolist:1 1
(integer) 1
redis 127.0.0.1:6379> SET todo:kuba:task:1 "finish blog post"
OK
redis 127.0.0.1:6379> SET todo:kuba:task:1:created_at 1343324314
OK
redis 127.0.0.1:6379> SET todo:kuba:task:1:updated_at 1343324315
OK
And we have our first task in. How do we get tasks from our todo list? Simple!
peeking task
redis 127.0.0.1:6379> LRANGE todo:kuba:todolist:1 0 -1
1)"1"redis 127.0.0.1:6379> GET todo:kuba:task:1
"finish blog post"redis 127.0.0.1:6379> GET todo:kuba:task:1:created_at
"1343324314"
Ok, so now we have very simple todo lists with tasks, or at least an overview of them. Of course you can use sets or zsets for todo lists, but let's stay with lists to keep it simple for now.
How to remove a task from the list?
removing task
redis 127.0.0.1:6379> LREM todo:kuba:todolist:1 -1 1
(integer) 1
redis 127.0.0.1:6379> LRANGE todo:kuba:todolist:1 0 -1
(empty list or set)
Good, now we can add tasks and remove tasks; it's the same story with adding and removing todo lists.
One last thing is to add tags! Simply put, each task will have a list of tags and each tag will have a list of the tasks related to it.
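The tag keys aren't spelled out above, so here is a sketch of how they could look (the key names are my guess), plus an example of tagging task 1 with "redis":

todo:<user_id>:task:<task_id>:tags => [LIST] tags on this task
todo:<user_id>:tag:<tag>:tasks => [LIST] ids of tasks tagged with <tag>

redis 127.0.0.1:6379> LPUSH todo:kuba:task:1:tags "redis"
(integer) 1
redis 127.0.0.1:6379> LPUSH todo:kuba:tag:redis:tasks 1
(integer) 1
redis 127.0.0.1:6379> LRANGE todo:kuba:tag:redis:tasks 0 -1
1) "1"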
This example shows what we need to do to tag a task and how to peek at the tasks tagged with it. Why do we keep both lists? To make searching fast. If a user clicks on a particular tag like "redis", you want to get its tasks in O(1) time, not O(N) after searching all keys. And the same the other way around: a normal UI will pull the task text, when it was created and its tags to display, so we want to have this data ready.
This is the whole design for the todo app; we end up with a handful of key types. Things like pagination and calculating time are all left to the app layer. The important thing is that I scope everything to the user key / id, because I want to isolate each user's space easily. Each user in his own space will have short lists; there is no danger of one "ultimate" non-splittable list.
Design using Mongodb
Well, this case is easier to grasp up front, because for each list we can use a single document or a collection of documents. Let's talk about both solutions.
Todolist = Document
In this example we will use the built-in "array" operators:
> db.todolists.update({name: "House work"}, {$push: {"tasks": {"name": "finish blog post", "tags": ["mongo"]}}})
"ok"
> db.todolists.find({name: "House work"})
[{"name": "House work", "_id": {"$oid": "50118742cc93742e0d0b6f7c"}, "tasks": [{"name": "finish blog post", "tags": ["mongo"]}]}]
This gives us a todo list named "House work" with our task pushed into its tasks array. Of course this way we do not get the sub-lists of tags etc. for free; we have to build them the same way as in Redis, just as part of the document. The story is exactly the same as in the Redis design above. MongoDB lets us query nested documents, though, and that will let us skip some of the extra "lists" when searching.
Let's try it out: how do we find entries tagged "mongo"?
find by tag
> db.todolists.find({"tasks.tags": {$in: ["mongo"]}})
[{"name": "House work", "_id": {"$oid": "50118742cc93742e0d0b6f7c"}, "tasks": [{"name": "finish blog post", "tags": ["mongo"]}]}]
This way we can find the whole todolist that contains a task tagged "mongo", but after that we will have to work out in the app layer which task inside the document we are actually interested in. Used like this, we end up with documents structured as in the find output above.
Using Redis we could wrap things in a MULTI block, while with the MongoDB "array" operators we are cowboying it a bit; they could remove the wrong things if we are not cautious :) (well, same in Redis!). A big plus of MongoDB is the native time type!
Todolist = many documents
Using this approach we can lean more on MongoDB's search capabilities: each task becomes a separate document.
With a structure like this:
task structure
{
  todo_list: "todo list id",
  user: "user id",
  text: "todo text",
  tags: ["Tag1", "Tag2"]
}
This way we will have a lot of documents and more disk space consumption, and we will still need a second collection with objects structured like this:
structure of todolist
{
  todo_list: "todo list id",
  name: "todo list name",
  user: "user id"
  // tasks: [Task ObjectID array] you could have this and remove the todo_list id from tasks, the choice is yours :)
}
And this way we can use the find tool very easily and get documents fast; for example, finding a user's tasks tagged "redis" might look like the query below.
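A minimal sketch of such queries, assuming the task documents live in a tasks collection (the collection name is my assumption, not from the original):

> db.tasks.find({user: "kuba", tags: {$in: ["redis"]}})
> db.tasks.find({todo_list: "1"})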
Summary
All of these solutions have pros and cons. MongoDB does better with bigger documents (the limit is 16 MB per document) than with loads of small documents (a massive waste of space). The Redis solution is really fast, and if you implement lazy loading it will be even faster. You can adjust these designs to your situation by changing lists to sets, etc. The place where Redis OWNS MongoDB in this context is "structures": we use a lot of them to store data like this, lists, sets, zsets. Implementing a priority list in MongoDB would be a totally custom solution, while in Redis we can just use a zset.
This is just my point of view. I will supply some code to cover it more in part two. That brings up the next problem: I'm sure the MongoDB solution, using things like Mongoid http://mongoid.org, will be much more developer friendly than building things "raw" with the Redis hiredis client.
Btw, I just wrote this off the top of my head so it may contain typos, and I'm sure the keys and structures can be optimized :) This is just to open a discussion with my friend :)
Prototyping a JSON / XML RESTful API with Rails is easy, before we decide to rewrite it in something like Erlang webmachine or node.js! For this purpose we can use the respond_to/respond_with syntax introduced in Rails 3 (ages ago); it is very cool.
Example controller
How do we use this? Let's take a peek at a simple example:
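The original listing isn't reproduced here, so this is a minimal sketch of such a controller, assuming a Venue model (the name is borrowed from the venues_url used further down):

class VenuesController < ApplicationController
  respond_to :html, :xml, :json

  def index
    @venues = Venue.all
    # renders index.html for HTML, or serializes @venues for xml/json
    respond_with(@venues)
  end

  def show
    respond_with(@venue = Venue.find(params[:id]))
  end
end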
The first thing we notice is respond_to :html, :xml, :json; this is something like old Merb's provides, where we specify which formats we want to respond to. The second change is how we lay out the action: all we have to do is pass the object we want to respond with to respond_with.
What does this buy us?
If we have an HTML request to actions like create, update or destroy, we want to redirect on success.
If we have a JSON or XML request to the same kind of "state changing" actions, we want to render the response in that format.
We achieve both of these with respond_with in just one line. But let's take a peek at a longer example:
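Again a sketch rather than the original listing, showing a create action with the :location option:

  def create
    @venue = Venue.new(params[:venue])
    @venue.save
    # on HTML success redirect to venues_url instead of the default venue_url(@venue)
    respond_with(@venue, :location => venues_url)
  end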
In this example we can see a create action where we add :location => venues_url; for a successful HTML-format request this will redirect to that URL.
Summary
Using this helps you write things fast and keep them readable; you can still use plain old respond_to inside an action with the old format.html syntax.
After a short talk with tomash and his gist https://gist.github.com/2871286 about the performance of vibe.d, I decided to write this post.
In my opinion pong tests are wrong and do not show real "performance". I told him to check Sinatra with the Thin handler instead of WEBrick, and that showed ~1.6k req/sec, which is not bad at all. On my box it was ~900 req/sec, so significantly less (MacBook Pro i5).
From the gist we can see that his vibe.d pong benchmark came in at 8425.85 req/sec (he used ab and I used httperf).
Warp
My first candidate in this competition of pong tests is the Haskell Warp handler.
code:
pingpong.hs
{-# LANGUAGE OverloadedStrings #-}
import Network.Wai
import Network.Wai.Handler.Warp
import Network.HTTP.Types (status200)
import Blaze.ByteString.Builder (copyByteString)
import qualified Data.ByteString.UTF8 as BU
import Data.Monoid
import Data.Enumerator (run_, enumList, ($$))

main = do
    let port = 8000
    putStrLn $ "Listening on port " ++ show port
    run port app

app req = return $
    case pathInfo req of
        ["pong"] -> pong
        x        -> index x

pong = ResponseBuilder status200 [("Content-Type", "text/plain")] $
    mconcat $ map copyByteString ["pong"]

index x = ResponseBuilder status200 [("Content-Type", "text/html")] $
    mconcat $ map copyByteString
        [ "<p>Hello from ", BU.fromString $ show x, "!</p>"
        , "<p><a href='/pong'>pong</a></p>\n" ]
It is semi Rack-like syntax ;)
Benchmark results
I prepared the results to show how big a lie the pong test is, because in this type of test / showoff all you really test is how fast you can accept connections. A single thread will always win :). But let's look at the results:
Multi-threaded – 4 threads (one for each core) – Request rate: 10020.8 req/s (0.1 ms/req). Vibe.d die!!! Yeah!!!!
Single threaded – default compilation – Request rate: 13584.1 req/s (0.1 ms/req) Mother of God!
Tested with the httperf command: httperf --uri=/ --port=8000 --num-calls=10000 --num-conns=20.
Summary
You can test it on your own; I did it with the latest GHC 7.4.1 from the Haskell Platform on OS X 10.7. Post a reply with your results :) maybe I missed something. The code and the scripts to build and run it are in the repository https://github.com/JakubOboza/haskell-warp-pong-test.
So how to test ?!
I think you should test your application in its default environment, so with the db behind it, but then anyone can say you are testing the performance of the db. Well, every day… users really are testing the performance of our db ;>… or of the weakest element in the chain. So if your db / rendering engine performs at 50 req/sec, a fast app handler will not turn it into 5000 req/sec.
The first question I ask when I need to add a new module to my code is: do I remember all the boilerplate, and how many times will I make a mistake this time? No more :). Rebar has a nice thing built in: a templating language that lets us build our own templates.
Custom template ?
I always end up looking at old projects, copying parts like gen_servers and reusing them. I always knew rebar has an option to write templates but never had time to look at it. Today, lol, I wanted to do some cleaning at home, so anything seemed like a good excuse not to do any cleaning :D.
What i need to know
A basic template is made from one or many .erl files written with mustache-style { { } } placeholders, and a .template file describing what to do with those files.
%%% @author {{author_name }} <{{author_email}}>
%%% @copyright {{copyright_year }} {{author_name}}.
%%% @doc {{description }}

-module({{name }}).
-behaviour(gen_server).
-author(' { {author_name } } <{ {author_email } }>').
-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2, terminate/2, handle_info/2, code_change/3, stop/1]).
% public api
start_link(_Args) ->
gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).
% state should be change with State that you will pass
init([]) ->
{ok, state}.
stop(_Pid) ->
stop().
stop() ->
gen_server:cast(?MODULE, stop).
handle_call({method_name_and_params}, _From, State) ->
Response= ok,
{reply, Response, State};
handle_call(_Message, _From, State) ->
{reply, error, State}.
handle_cast(_Message, State) -> {noreply, State}.
handle_info(_Message, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVersion, State, _Extra) -> {ok, State}.
This is the template. I know it's a bit long; I tried to cut out all the comments, eunit etc. and narrow it down to the minimum. I posted it to show how much you can save :). Now you will need the transformation file. All the things in { { something } } will be replaced by values we type on the command line or by defaults from the template file.
For me it looks something like the sketch below: we have a few default definitions and, at the bottom, the template directive! This is the important part; it says which file rebar has to copy where and what the name of the new file will be. Now everything should be clear!
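A minimal sketch of what such a .template file can look like (the file names and default values here are made up, not copied from my repo):

{variables, [{name, "my_server"},
             {author_name, "Jakub Oboza"},
             {author_email, "jakub.oboza@gmail.com"},
             {copyright_year, "2012"},
             {description, "generic gen_server"}]}.
{template, "gen_server.erl", "src/{{name}}.erl"}.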
Inject it into rebar!
All you need to do now is symlink your template folders into ~/.rebar/templates and you can use them!
(you can symlink your folder or just create one there :) )
When I was looking at this post I saw that { { was converted in a wrong way by Octopress, so I added spaces between the braces! Check the repo for the correct code!
My own templates
Today I started adding my own templates; initially I have only gen_server and webmachine_resource, but I will add more :). It is fun, it is like building your own anti-boilerplate framework.
I'm using New Relic for monitoring my servers and I think it is a great tool, but recently their marketing wing just spammmmmmed and hammered me with emails, and when I replied to one of these emails it got even worse. My name is not Dave, David, John or Mark, it's Jakub. So I thought about processing nginx and all the application server logs on my own, just for the lulz :).
The first thing to achieve was a tool able to work with logs in a way that does not limit its usage to small files. This is simple: I decided to build a lib that enables stream-processing logs by "entry".
Initial idea
We will emit a single "entry": for logs that are built around "lines" we will emit a line, and for logs like Rails' we will emit the whole entry. Line streaming is easy, so a few hours ago, while watching season 3 of http://en.wikipedia.org/wiki/Metalocalypse, I wrote this.
Future toons
Version 0.0.1: https://github.com/JakubOboza/future_toons. This is my entry point to analytics on big files :). The first thing was a benchmark. I said to myself that it can't take more than 0.5 sec on my box (i5 MBP) against a 100 MB nginx log file. I took a log from production and checked my initial code: it was 0.37 sec. That's ok.
lib/toons.js
var util   = require('util');
var events = require('events');

function FutureToons(filename, callback, end_callback){
  // in case someone will call it the wrong way ;)
  if (false === (this instanceof FutureToons)) {
    return new FutureToons(filename, callback);
  }

  events.EventEmitter.call(this);

  this.onEnd(end_callback);

  if (filename && callback) {
    // if both present run instantly
    this.onLine(callback);
    this.run(filename);
  }
}

// do not put methods between this line and the initial definition
util.inherits(FutureToons, events.EventEmitter);
The code of the whole thing is very simple, but while building a node.js module there are a few useful things. First of all, you don't need to implement your own way of doing inheritance; you can use the one from util. (Remember to put it just after the function definition, because it overrides the prototype :) you don't want to lose your "instance methods", do you?)
The next nice thing to help users is the instanceof check right at the top of the "constructor" function. This prevents users from using it wrong. Well, it's more accurate to say it lets them use it wrong and fixes their mistake: calling FutureToons(...) without new will produce the same output and act in the same way.
How to use this ?
It is simple :) You need to know three things: where the file is, what you want to do with each line, and whether you need to do something at the end.
basic example:
var toons = require('future_toons');

var on_each_line_callback = function(line){
  console.log("> " + line);
};

new toons("example.txt", on_each_line_callback);
// this will auto trigger run and process it, but you can delay it like this

var streamer = new toons();
streamer.onLine(on_each_line_callback);
// you can add a function on end too! :)
streamer.onEnd(function(){ console.log("lol") });
// run it naow!
streamer.run("example.txt");
// you can reuse it for many files if you want ;p
This example will show each line of the file prefixed with the ">" symbol, and at the end it will print out "lol". This is the most common case in the real world :) you need to optimize "lol". And for now that's the whole API.
Command line interface
Currently I have added a simple command line interface, just to play with it. Example usage:
λ time node bin/toons -f ~/Downloads/access.log -e "function(line){}"
node bin/toons -f ~/Downloads/access.log -e "function(line){}"  0.38s user 0.07s system 101% cpu 0.440 total
It is a bit unsafe now so maybe i will remove it soonish.
Db vs File
Some people say they need to put logs into a db; I always ask these people "why not just a file? The db will have to write it to a file anyway :)"
Summary
If you get an email from a New Relic sales guy, ignore it! Don't ever reply!!! Haha, while writing this post I got a new email from their sales. The code is on GitHub; I hope I will have some time to work on it, and it will have Rails production.log streaming support soonish.
Everyone who wants to feel safe about his data wants to have some sort of backup :). Redis has support for replication.
And it is very easy to set up.
Setup!
To set up a replica node, all you have to do is add one line, slaveof, to the config @_@ of your new Redis instance.
Sounds easy :). Let's think about the most basic scenario.
Two nodes, a master node and a slave node. For the purpose of this example you can just start Redis using the redis-server command without any config; by default it will start on port 6379, and this is all we need to know to set up replication.
Configuration of replica node
To configure the replica node, all we need to do is create a place to store the db, e.g. mkdir replica_dir, and choose a port, e.g. 7789. The last thing to do is create a config and point this node at the master. For me it looks like this:
daemonize no
timeout 0
loglevel notice
logfile stdout
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir ./replica_dir
slave-serve-stale-data yes
slave-read-only yes
appendonly no
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
lua-time-limit 5000
pidfile /var/run/redis-replica-7789.pid
port 7789
# replication config
slaveof 127.0.0.1 6379
slowlog-log-slower-than 10000
slowlog-max-len 1024
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
Here the really important thing is slaveof 127.0.0.1 6379, which sets where our master is. port 7789 matters if we are running a few Redis instances on one box, and for dir ./replica_dir be sure not to point this at the master node's db path; if you do… you will suffer eternal flame.
Now just start the node with redis-server redis-replica.conf and it will start syncing.
Checking if everything works
So by now we should have the master running on the default port and the replica connected to it. Let's connect to the master using redis-cli and set some keys, e.g. set name jakub. Now let's connect to our replica. If you followed the same configuration as me, you can simply do ./redis-cli -p 7789 and this will give you the regular command line interface. Now just type get name:
redis 127.0.0.1:7789> get name
"jakub"
Bang works!
RO
An important piece of information: one master can have many replicas, and each replica is read only! So you can connect to one and read from it if you want / need to.
λ ./redis-cli -p 7789
redis 127.0.0.1:7789> keys *
1)"age" 2)"name"redis 127.0.0.1:7789> get name
"jakub"redis 127.0.0.1:7789> set name "not jakub"(error) READONLY You can't write against a read only slave.
redis 127.0.0.1:7789>
Summary
I have never had deadly important data in Redis :) but it's still worth knowing how to set this up; in case something goes wrong, you may want to have a replica ready :).
I have spent a bit of time figuring out how to build modules, where to put dependencies and how to form package.json, so I decided to create this post to gather some of that info in one place. These modules are more like Ruby gems than the language construct for grouping functions that is also called a module (e.g. in Erlang). I was recently at a conference and I want to post something about a module I was working on at the airport, but before that post I need to add this one so I will have something to reference.
npm
Npm stands for node package manager and it is something like gems in Ruby, eggs in Python or apt in Debian. It lets you search, install and update your node or application modules. You can create node.js modules without npm, but if you want to publish your module it's better to do it this way. If you have node.js > 0.6.10 you should have npm bundled with your installation; if not, go to http://npmjs.org/ and follow the instructions (for normal platforms all you have to do is run http://npmjs.org/install.sh).
scaffold
The first thing to do is initialize our module; to do this we need to create a directory for it and run npm init like this:
Package name: (my_first_module)
Description: my first module
Package version: (0.0.0)
Project homepage: (none) no-fucking-idea.com
Project git repository: (none)
Author name: Jakub Oboza
Author email: (none) jakub.oboza@gmail.com
Author url: (none) no-fucking-idea.com
Main module/entry point: (none)
Test command: (none) mocha -R landing lib/my_first_module.js
About to write to /private/tmp/my_first_module/package.json

{
  "author": "Jakub Oboza <jakub.oboza@gmail.com> (no-fucking-idea.com)",
  "name": "my_first_module",
  "description": "my first module",
  "version": "0.0.0",
  "homepage": "no-fucking-idea.com",
  "scripts": {
    "test": "mocha -R landing lib/my_first_module.js"
  },
  "dependencies": {},
  "devDependencies": {},
  "optionalDependencies": {},
  "engines": {
    "node": "*"
  }
}

Is this ok? (yes) yes
This will create package.json for us. Now we have a fully functional module and we could stop here…. but
package.json
This file is the description of our package. It is in JSON form, so it should be easy to read and change.
Most of these fields don't need a lot of description because the keys are self-explanatory, but the thing we should look at is "scripts", where we defined the "test" key. If we now run npm test it will execute this command, which is very useful.
"engines" defines which versions of node.js our code will work on; leaving it at * is a bit of a hazard. You can set it to something specific if you want.
Important thing! If we do not specify an entry point for our module, by default it will look for index.js, so for now let's leave it this way.
tests with mocha
If we want to write reasonable code that we can rely on, we should be doing massive testing. I think mocha is a very good library for this purpose! I strongly suggest installing it with the -g flag so it will be accessible in npm's global scope:
λ npm install -g mocha
Code
Ok, but we don't have any code yet ;/. So let's start coding.
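The exact steps aren't shown in the original; roughly, you create an (empty for now) module file and run the test task:

λ mkdir lib
λ touch lib/my_first_module.js
λ npm test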
Everything passes, which is expected because we don't have any tests :) Now let's add one more thing to package.json: a development dependency on should.js. It will enable us to use mocha in a somewhat RSpec-like BDD style. Like this:
package.json
"devDependencies": {
  "should": ">= 0.0.0"
},
And again npm install -l to get everything installed locally.
First test
Initial mocha test for our module
lib/my_first_module.js
function MyFirstFoo(a, b){

}

module.exports = MyFirstFoo

require('should')

describe('MyFirstFoo', function(){

  it("should be able to add", function(){
    MyFirstFoo(2,3).should.be[5];
  });

});
First we define an empty function body; next we have module.exports =, which marks which functions will be visible outside this module when other clients require it. If you need more info about writing specs in mocha, please read my earlier post http://no-fucking-idea.com/blog/2012/04/05/testing-handlebars-with-mocha/. Now let's run npm test:
λ npm test

> my_first_module@0.0.0 test /private/tmp/my_first_module
> mocha -R landing lib/my_first_module.js
-----------------------------------------------------------------------------------------------------------------------------------------------
⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅✈
-----------------------------------------------------------------------------------------------------------------------------------------------
✖ 1 of 1 tests failed:
1) MyFirstFoo should be able to add:
TypeError: Cannot read property 'should' of undefined
    at Context.<anonymous> (/private/tmp/lol/my_first_module/lib/my_first_module.js:13:21)
    at Test.run (/opt/local/lib/node_modules/mocha/lib/runnable.js:156:32)
    at Runner.runTest (/opt/local/lib/node_modules/mocha/lib/runner.js:272:10)
    at /opt/local/lib/node_modules/mocha/lib/runner.js:316:12
    at next (/opt/local/lib/node_modules/mocha/lib/runner.js:199:14)
    at /opt/local/lib/node_modules/mocha/lib/runner.js:208:7
    at next (/opt/local/lib/node_modules/mocha/lib/runner.js:157:23)
    at Array.0 (/opt/local/lib/node_modules/mocha/lib/runner.js:176:5)
    at EventEmitter._tickCallback (node.js:192:40)
    ...
It fails as expected, so let's add the implementation to our function.
lib/my_first_module.js
function MyFirstFoo(a, b){
  return a + b;
}

module.exports = MyFirstFoo

require('should')

describe('MyFirstFoo', function(){

  it("should be able to add", function(){
    MyFirstFoo(2,3).should.be[5];
  });

});
We have landed safely ;) Now we are ready for development!
Summary
Creating good quality code in node.js requires testing; that's why I decided to join these two things and explain how to marry them quickly. More info can be found here: http://howtonode.org/how-to-module.
Redis cluster is currently unstable; I used today's master HEAD commit (93a74949d7bb5d0c4115d1bf45f856c368badf31) to build my Redis server and client. Setting up a Redis cluster requires only a few settings! :)
Regular nodes can't be part of a cluster :( so you have to prepare separate Redis configs for your cluster servers.
The most important thing is to set cluster-enabled and cluster-config-file. I decided to name my config files redis-cluster-<port>.conf, and I used ports 4444, 4445 and 4446.
Here is my sample config
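The exact file isn't reproduced here; a minimal sketch for the node on port 4444 (the other two differ only in port, dir and file names; nodes-4444.conf is just my choice of name):

port 4444
daemonize no
dir ./cluster_4444
cluster-enabled yes
cluster-config-file nodes-4444.conf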
For each node I created a directory cluster_<port>, and that was actually the hardest part. With this, all you have to do is start all the nodes (for debugging you can set daemonize to no) using redis-server path/to/redis-cluster-<port>.conf and then use the magic Ruby tool :)
redis-trib.rb
In the src/ directory of the source you can find a Ruby script for creating and managing the cluster. But first you need Ruby installed along with the redis gem. I just did gem install redis, but if you don't have Ruby you will have to google how to install it (hint: get 1.9.2).
Now you can run the script, ./redis-trib.rb, and see what it can do.
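The script's output isn't included here, but the interesting part is the create command, which for the three nodes above would look something like this (my reconstruction, not the exact invocation from the post):

./redis-trib.rb create 127.0.0.1:4444 127.0.0.1:4445 127.0.0.1:4446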
Using this tool you can also reshard :D I did it on my 15 keys and it worked :-F.
Smart clients
In the Redis docs we can read that you will need a "smart client" to make it low latency. Yes, you can read from the output that a key was moved, so you will have to cache where the key is now and reset that temp cache when it gets moved again (resharding).
Fire!
You can now test how it behaves under fire by killing and restarting your nodes, e.g.:
[19008] 16 Apr 19:44:09.945 # Server started, Redis version 2.9.7
[19008] 16 Apr 19:44:09.946 * The server is now ready to accept connections on port 4444
[19008] 16 Apr 19:45:14.414 * Connecting with Node c20290a7b70a2a840a168c3309f00e3de1b1844d at 127.0.0.1:14446
[19008] 16 Apr 19:45:15.424 * Connecting with Node ab93647957ed4bb93fc43b1dc76202a6cdb94f49 at 127.0.0.1:14445
[19008] 16 Apr 19:59:10.047 * 1 changes in 900 seconds. Saving...
[19008] 16 Apr 19:59:10.047 * Background saving started by pid 19321
[19321] 16 Apr 19:59:10.080 * DB saved on disk
[19008] 16 Apr 19:59:10.248 * Background saving terminated with success
[19008] 16 Apr 20:08:29.837 * I/O error reading from node link: connection closed
[19008] 16 Apr 20:08:29.837 * I/O error reading from node link: connection closed
[19008] 16 Apr 20:08:30.056 * Connecting with Node 76d06b0d3cb1b3829cb60574260dff2d06964cea at 127.0.0.1:14446
[19008] 16 Apr 20:08:30.056 * I/O error writing to node link: Broken pipe
[19008] 16 Apr 20:08:30.525 * I/O error reading from node link: connection closed
[19008] 16 Apr 20:08:30.526 * I/O error reading from node link: connection closed
[19008] 16 Apr 20:08:31.063 * Connecting with Node 35d107017bc726ece9b57e1ea2f21678555cf6a8 at 127.0.0.1:14445
[19008] 16 Apr 20:08:31.064 * Connecting with Node 76d06b0d3cb1b3829cb60574260dff2d06964cea at 127.0.0.1:14446
Summary
Even though I think this is a great tool, it is unstable: after a few minutes of play I saw that some things just don't work as intended and some keys are not pushed. But it is pulled from the unstable branch, so I'm crossing my fingers for this project because it looks sweet! Go go Antirez.