Stack Overflow - Architecture of High-Load Sites
https://www.dev-metal.com/architecture-stackoverflow/
Stack Overflow's successive re-architectures reveal part of the architecture that lets the site handle traffic of hundreds of millions of visits per month.
Three core principles of Stack Overflow's architecture
- Performance is a feature: performance is treated as a core feature of the product.
- Cache all the things: cache everything possible, because this is a Q&A site where reads far outnumber writes.
- Reinvention is OK: by reworking many core technologies to guarantee system performance, the founders produced a number of very high-performance libraries such as StackExchange.Redis, Dapper, Jil…
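To make the "cache all the things" principle concrete, here is a minimal cache-aside sketch for a read-heavy workload. It is an illustration only, not Stack Overflow's code: the `CacheAside` class, the `load_question_from_db` loader, and the `question:42` key are hypothetical names, and a plain in-memory dictionary stands in for Redis.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """In-memory cache-aside helper (a stand-in for Redis, not its API)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry timestamp, cached value)
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key: str, loader: Callable[[str], str]) -> str:
        """Return the cached value, calling the loader only on a miss."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: go to the slow source and remember the result.
        self.misses += 1
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after a write so the next read reloads it."""
        self._entries.pop(key, None)


def load_question_from_db(key: str) -> str:
    """Hypothetical slow loader standing in for a database query."""
    return f"rendered page for {key}"


cache = CacheAside(ttl_seconds=60.0)
for _ in range(100):
    page = cache.get_or_load("question:42", load_question_from_db)
```

With 100 reads of the same key, only the first one reaches the loader. That ratio is why caching pays off when reads dominate writes; a write would call `invalidate` so the next read picks up fresh data.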
Following these principles, Stack Overflow has gone through two re-architectures, applying many technologies to improve the product's performance, such as:
- Redis (NoSQL) as a cache…
- HAProxy as the load balancer
- Lucene.NET and Elasticsearch as the search engine
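As a rough illustration of what a load balancer in front of several web servers does, the sketch below rotates requests across backends and skips any that are marked down. The source does not say which balancing algorithm Stack Overflow configures in HAProxy; round-robin is used here only because it is the simplest one to show, and the `RoundRobinBalancer` class and the `web-01` style names are hypothetical.

```python
from typing import List, Set


class RoundRobinBalancer:
    """Rotate over backends in order, skipping ones marked as down."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._down: Set[str] = set()
        self._next = 0  # index of the backend to try next

    def mark_down(self, backend: str) -> None:
        self._down.add(backend)

    def mark_up(self, backend: str) -> None:
        self._down.discard(backend)

    def pick(self) -> str:
        """Return the next healthy backend, or raise if none is healthy."""
        for _ in range(len(self._backends)):
            backend = self._backends[self._next]
            self._next = (self._next + 1) % len(self._backends)
            if backend not in self._down:
                return backend
        raise RuntimeError("no healthy backend available")


balancer = RoundRobinBalancer(["web-01", "web-02", "web-03"])
first_round = [balancer.pick() for _ in range(4)]
# first_round == ["web-01", "web-02", "web-03", "web-01"]

balancer.mark_down("web-02")  # e.g. a failed health check
after_failure = [balancer.pick() for _ in range(4)]
# after_failure == ["web-03", "web-01", "web-03", "web-01"]
```

A real load balancer also handles health checking, connection limits, and TLS; the sketch covers only the selection step.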
Which tools and technologies are used to build the Stack Exchange Network?
Stack Overflow uses a WISC stack:
- Operating System: Microsoft Windows Server 2012 R2 x64
- Web Server: IIS 8.5
- Database: SQL Server 2014 CTP2 running Microsoft Windows Server 2012 R2 x64
- Language: C#
Software Development Tools
- IDE: Visual Studio 2015
- Framework: Microsoft ASP.NET (version 4.0) on .NET 4.5.2
- Web Framework: ASP.NET MVC 5 with MiniProfiler
- View Engine: Razor 3
- Browser Framework: jQuery 1.12.4
- Data Access Layer: LINQ to SQL and Dapper
- Cache / Additional Data: Redis 2.8.19 via StackExchange.Redis, with serialization via protobuf-net
- Source Control: Git using a self-hosted GitLab instance (previously Mercurial from 2010–2014, Subversion from 2008–2010)
- Compare Tool: Beyond Compare 4
External Bits
Code used in Stack Overflow that is not included as part of the development tools:
- reCAPTCHA
- DotNetOpenAuth
- WMD - Now developed as open source in the project PageDown
- Prettify
- Google Analytics
- TeamCity
- HAProxy
- Cacti
- MarkdownSharp
- GitLab
- Orion
- LESS (source)
- MathJax
- Elasticsearch (source)
Miscellaneous
- WordPress on Linux (Site Blogs)
- Jekyll (on linux?) (for blog.stackexchange.com)
- WebSocket (for real time updates; custom C# implementation)
- Bandwidth used by Stack Exchange sites
- jQuery Isotope plugin (for the grid-style site list) (Source)
Content
- License: Creative Commons Attribution-Share Alike 3.0 Generic
- Standards: OpenSearch, Atom
- Host: three datacenters:
  - New York: QTS (technically in Jersey City, NJ now). Formerly hosted at Internap and PEER 1.
  - Denver: FORTRUST
  - Oregon: PEAK Internet
Hardware
- 11 Dell R630 IIS web servers (9 shared for all production like SO, two for Meta and development):
  - 2x Intel Xeon Processor E5-2690 v3 @ 2.6 GHz 12 Core with 24 threads
  - 64 GB RAM
  - Windows Server 2012 R2
  - Two drives
    - 2 Intel 320 300GB SSD (RAID 1)
  - 2x 10Gb network team
- Three Dell R720xd database servers (two in New York City, one in Denver, using SQL AlwaysOn Clustering) (Global "Sites" DB & Stack Overflow dedicated):
  - 2x Intel Xeon Processor E5-2680 @ 2.7 GHz
  - 384 GB RAM
  - 21 drives
    - Mirrored Pair for OS
    - 2 Intel P3700 2TB PCIe NVMe RAID1 for databases
    - 24 Intel 710 200GB SSD RAID10 for databases
  - SQL Server 2014 SP1
  - 2x 10Gb network team
- Three Dell R730 database servers (two in New York City, one in Denver, using SQL AlwaysOn Clustering) (All other sites, Careers, Area 51, etc.):
  - 2x Intel Xeon Processor X5680 @ 3.33 GHz
  - 768 GB RAM
  - 28 drives
    - Mirrored Pair for OS
    - 2 Intel P3700 2TB PCIe NVMe RAID0 for databases
    - 24 1.2TB 10K RAID10 for large databases
  - SQL Server 2014 SP1
  - 2x 10Gb network team
- Two Dell R620 HAProxy servers (direct):
  - 2x Intel Xeon Processor E5-2650 @ 2.0 GHz
  - 64 GB RAM
  - CentOS 7
  - 2x 10Gb network team (internal)
  - 2x 10Gb network team (external)
- Two Dell R620 HAProxy servers (CloudFlare):
  - 2x Intel Xeon Processor E5-2637v2 @ 3.5 GHz
  - 192 GB RAM
  - CentOS 7
  - 2x 10Gb network team (internal)
  - 2x 10Gb network team (external)
- Two Dell R630 Railgun servers (CloudFlare):
  - 2x Intel Xeon Processor E5-2699v3 @ 3.5 GHz
  - 192 GB RAM
  - CentOS 7
  - 2x 10Gb network team (internal)
  - 2x 10Gb network team (external)
- 2 Dell R630 Redis servers:
  - 2x Intel Xeon Processor E5-2687Wv3 @ 3.1 GHz
  - 128 GB RAM
  - CentOS 7
  - 2x 10Gb network team
- Three Dell R630 Service servers for tag engine/search:
  - 2x Intel Xeon Processor E5-2643v3 @ 3.4 GHz
  - 64 GB RAM
- One Dell R620 Backup server running NetBackup (most backups):
  - 2x Intel Xeon Processor E5-2620 @ 2.0 GHz
  - 16 GB RAM
  - 14 drives
    - Mirrored Pair for OS
    - 12 4TB 10K RPM RAID10 for backups (DAS)
  - 2x 10Gb network team
- One Dell R730xd SMB3 Backup server (SQL backups):
  - 2x Intel Xeon Processor E5-2623v3 @ 3.0 GHz
  - 16 GB RAM
  - 30 drives
    - Mirrored Pair for OS
    - 16 6TB 7.2K RPM RAID10 for backups (Internal)
    - 12 4TB 10K RPM RAID10 for backups (DAS)
  - 2x 10Gb network team
- Four Dell FC630s running VMware ESX (in two FX2s chassis):
  - 2x Intel Xeon Processor E5-2698 v2 @ 2.30 GHz
  - 768 GB RAM
  - 16x 10Gb network team (8x 10Gb per FX2s)
- 2 Cisco ASR1001-X routers
- 2 ASR1001 Routers
- 2 Fortinet 800C Firewalls
- 2 Cisco Nexus 5596 Cores in an active/active redundant configuration
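The drive lines in the hardware list give raw drive counts and RAID levels rather than usable space. As a reading aid, the sketch below estimates usable capacity using the standard rules: RAID 0 stripes all drives, RAID 1 mirrors a set, and RAID 10 stripes across mirrored pairs. The `usable_capacity_gb` helper is hypothetical, and the figures ignore filesystem overhead, hot spares, and the difference between decimal and binary units.

```python
def usable_capacity_gb(drive_count: int, drive_size_gb: float, raid_level: str) -> float:
    """Estimate usable capacity in GB for RAID 0, RAID 1, or RAID 10."""
    level = raid_level.upper().replace(" ", "")
    if drive_count < 1 or drive_size_gb <= 0:
        raise ValueError("drive count and size must be positive")
    if level == "RAID0":
        # Striping only: every drive contributes its full size.
        return drive_count * drive_size_gb
    if level == "RAID1":
        if drive_count < 2:
            raise ValueError("RAID 1 needs at least two drives")
        # All drives hold the same data, so capacity equals one drive.
        return drive_size_gb
    if level == "RAID10":
        if drive_count < 4 or drive_count % 2 != 0:
            raise ValueError("RAID 10 needs an even number of drives, at least four")
        # Striping across mirrored pairs: half of the raw capacity.
        return drive_count * drive_size_gb / 2
    raise ValueError(f"unsupported RAID level: {raid_level}")


# Entries taken from the hardware list (sizes converted to GB):
web_os = usable_capacity_gb(2, 300, "RAID 1")        # 2 Intel 320 300GB SSD
sql_ssd = usable_capacity_gb(24, 200, "RAID10")      # 24 Intel 710 200GB SSD
backup_das = usable_capacity_gb(12, 4000, "RAID10")  # 12 4TB 10K RPM
print(web_os, sql_ssd, backup_das)  # prints "300 2400.0 24000.0"
```

So the 24-drive SSD array on the Stack Overflow database servers offers about 2.4 TB of usable space, and the 12-drive backup array about 24 TB.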
Sources:
- Stack Overflow's New York Data Center (Server Fault Blog)
- Designing for Scalability of Management and Fault Tolerance (Server Fault Blog)
- What Was Stack Overflow Built With?
- Stack Overflow Server Glamour Shots
- Technology and SEO profile for stackoverflow.com
- Stack Overflow and DVCS
- Stack Overflow Network Configuration