1 streaming API + 11 workers + 4 special scripts = 150 Million Tweets Saved


This afternoon we passed a major milestone: TwapperKeeper has now saved over 150 MILLION TWEETS.

And other than the disk space issue we have now, the system is running relatively smoothly and keeping up with the throughput of tweets, which often exceeds 150 tweets per second.

The only major issue we are concerned about is how fast @person archives fill up after first being created. We will probably introduce slightly more advanced logic to our archiving routines so users see those archives fill quickly after creation.

If anyone has any questions, don’t hesitate to ask.

And if you wonder how we are going to pay for all this extra needed disk space, get in line.  We aren’t sure just yet either 😉  A P2P storage concept would be pretty nice about right now… 🙂



One Response to “1 streaming API + 11 workers + 4 special scripts = 150 Million Tweets Saved”

  1. twapperkeeper Says:

    Oh wait, there is one issue we have seen: there do seem to be numerous tweets that don’t have a timestamp / create date on them. We are not 100% sure whether these were lost in the migration or were already missing beforehand.

    However, we can usually fix the issue, so if you see an archive that needs to be fixed, let us know. We are also looking across the database to proactively correct these tweets, but if you need it fixed faster, just let us know! – @jobrieniii

