This post is about a naive benchmark: NodeJS + Cassandra VS Ruby on Rails + MySQL. There are plenty of benchmarks in the wild(db/web). However, it’s still difficult to decide which stack solution to use for your application. Actually, it heavily depends on the application/use case/scenario. This benchmark is based on a very simple scenario: a click count application. This benchmark is very naive! Be aware of that when you use it as a reference.
这篇帖子写得是关于NodeJS + Cassandra VS Ruby on Rails + MySQL的性能对比。网上其实有很多很多这样的比较(数据库/网站框架)。但是我看了以后还是觉得很难决定用哪个产品来做哪个应用。大家都论调不同,视角不同,比较的人也很难说对所有的产品都精通,难免有偏颇,甚至有的可能有广告嫌疑。反正,性能的对比很大程度上依赖于应用/案例/场景。我的这个简单的比较是基于一个简单的点击计数的案例。(这个比较非常初步,仅供参考,可能以后会更深入的研究一下)
NodeJS + Cassandra VS ROR + MySQL
Setup
- Both use an individual virtual machine (VirtualBox) with 1 core cpu & 2 GB Memory
- Install the stack in their respective vm
- Start the services (NodeJS, Cassandra, Ruby on Rails, MySQL) with no tunning - Use configuration come with default, minium modification to make it up & running
- Granularity is hour
Application Implementation
NodeJS + Cassandra
Stack: Cassandra, node cassandra driver, Express, NodeJS, nvm, npm
DB Schema (CQL):
1
|
|
- Query to insert a hit (CQL):
1
|
|
Application Code: For detailed code, see my Github
Command to start server:
1
|
|
ROR + MySQL
Stack: MySQL, mysql2 gem, Rails, Ruby, rvm, bundler
DB Schema (ActiveRecord Migration):
1 2 3 4 5 6 7 8 9 10 11 12 |
|
- Query to insert a hit (ActiveRecord):
1 2 3 4 5 6 |
|
Application Code: For detailed code, see my Github
Command to start server:
1
|
|
Execution
To generate load, I used my python script. It send requests to a target url with multiple threads. I chose 3 urls associated with three click target ID’s. The rule is simple, push the applications to the limit. If it can handle more, create more threads to generate more load until reach the limit.
Result
Stack | Node+Cas | ROR+MySQL |
---|---|---|
Max RPM | 18,000 | 12,000 |
Response Time | very fast | fast |
NodeJS crashed when RPM goes to around 36,000
Conclusion
Be aware that the data set is not large
NodeJS + Cassandra can handle more requests. I think Cassandra is faster to perform counter increment.
NodeJS + Cassandra response the request faster, because it’s non-blocking. Requests can be responsed while db query is still running in NodeJS.
Ruby on Rails + MySQL is more stable. It didn’t crash when it’s overloaded. Maybe I didn’t implement the NodeJS correctly? Please comment.
NodeJS + Cassandra is much easier to develop for this use case. e.g. Insert & Update are same in cassandra. That means you don’t have to create a new row before the click increment query.
NodeJS + Cassandra retreives data faster, since the data is sorted by timestamp in storage. On the other side, MySQL requires scanning over the entire table and then sort.
Cassandra has a better scalability. By adding Cassandra nodes, performance can be improved in a linear fashion.
Future Study
This benchmark is very naive. The following points can be improved in the future.
Larger data set
Mix & Match: NodeJS + MySQL Ruby on Rails + Cassandra
More DB: MongoDB CouchDB
More web framework: PHP Django
Reading operation performance
Reference
- Cassandra VS MongoDB
- Couchbase VS Cassandra the two links above are really conflicting
- Rails VS NodeJS nodejs的吞吐量貌似没有它说的那么牛,但是他对rails的优势劣势分析得还是挺中肯的
- Web App Framework Benchmarks
- Cassandra
- NodeJS
- Express
- Node Cassandra Driver