In late spring term 2014, Moodle would periodically reach a load level that made it effectively unusable. During the early part of summer term, OS load levels would climb from a reasonable level of 2 and spike to 50 or 60. During these high-load events, both CPU utilization and memory utilization would spike, along with disk caching. From a user perspective, the system was down.
In the early part of the diagnostics, Munin, a systems monitoring tool, collected detailed performance data from both the Moodle server and the MySQL back-end database. It showed that memory paging was occurring between RAM and disk. After identifying the paging issue, the system admin increased memory (from 16G to 64G) and CPUs (from 2 to 6), and increased the buffer size of the MySQL server. This fixed the paging issue, but the same performance symptoms were still observed.
A more drastic step was needed to determine the underlying cause of the performance issue. Moodle was split into two instances, one running fall, winter and spring, while the second ran summer only. Performance issues were still observed. At this point, attention turned to large embedded video files and large course backups, both of which caused excessively high load levels. The large embedded video files were identified as a possible cause, moved out of Moodle and replaced with links to the videos. At the same time, a memory leak was identified within Flowplayer. Flowplayer was updated to the version used in Moodle 2.7, and the system admin put a cron job in place that frees unused memory every 3 to 5 minutes. After these changes, CPU, load average, memory and network utilization were greatly reduced.
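The text does not say how the cron job frees memory. As a minimal sketch only, assuming a Linux server and assuming the job releases reclaimable kernel caches, a script along these lines could be run from cron as root. The function name `drop_caches` and the `path` parameter are hypothetical and exist only to make the sketch testable; `/proc/sys/vm/drop_caches` is the standard Linux kernel interface.

```python
import os

# Standard Linux kernel interface for releasing reclaimable caches.
DROP_CACHES_PATH = "/proc/sys/vm/drop_caches"


def drop_caches(level=3, path=DROP_CACHES_PATH):
    """Ask the kernel to release clean caches.

    level 1 = page cache, 2 = dentries and inodes, 3 = both.
    Writing to the real path requires root. This is an assumed
    technique, not necessarily what the original cron job did.
    """
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2 or 3")
    # Flush dirty pages to disk first so more cache becomes reclaimable.
    os.sync()
    with open(path, "w") as handle:
        handle.write(f"{level}\n")
    return level


if __name__ == "__main__":
    drop_caches()
```

A crontab entry such as `*/5 * * * * /usr/bin/python3 /usr/local/sbin/free_memory.py` (hypothetical script path) would run it every five minutes. Dropping caches only discards clean, reclaimable data, so it treats the symptom and not the leak itself.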
What is the current status of Moodle?
- Moodle 2.5, with the upgraded Flowplayer, has not performed poorly since week 28, 2014
- Moodle 2.7 is currently being configured for fall term
- The memory cleanup script will remain in place for Moodle 2.7
- Load testing will be performed using LoadStorm.com, and results will be posted here
- The location of large video files will be dependent upon the results of load testing
- Test performance utilizing SSD system drives
August 8, 2014 Update
- Moodle 2.7, the latest version, is available for Fall term course development. It is located at https://ginger.wou.edu and its final name will be https://moodle.wou.edu
- All Moodle data on 2.7 have been moved to SSD drives (solid state disk)
- A performance test was run on spinning disks, followed by SSD drives. The graph can be seen here: (the large double spikes represent a 580-user load test on spinning drives; the small bump in the afternoon is a 5,125-user load test on solid state drives). What this means to you is that the system can handle roughly 10 times the users at 1/20 the processing time.
- A second Moodle web server was added, doubling the capacity. Testing will commence shortly.
- online3.wou.edu and online2.wou.edu are meant for archival purposes only. New courses must be developed on moodle.wou.edu. For help exporting from old versions of Moodle into the 2.7 version of Moodle, call Elayne Kuletz.
August 28, 2014
Today, there are 4 web servers residing behind the load balancer. The load balancer hands out initial web requests using a round-robin algorithm. This means that if a total of 100 active users are on Moodle, each web server will be handling 25 users. To access the load-balanced Moodle, go to https://moodle.wou.edu. Each web server can handle about 410 users and still maintain a good user experience. This will give us a capacity of 1,240 concurrent users with a good user experience. This past academic year, the maximum number of concurrent users observed was 250. All load testing was performed while playing large videos. As our load grows, additional web servers can easily be cloned from the existing web servers and placed behind the load balancer. By the end of next week, we will have a consultant available to provide systems administration support.
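The round-robin distribution described above can be illustrated with a short sketch. It is a simplified model, not the Cisco ACE configuration: the server names are hypothetical, and each user is treated as a single initial request. The capacity helper takes the per-server figure as a parameter, since four servers at 410 users each come to 1,640, while a total of 1,240 corresponds to 310 users per server.

```python
from collections import Counter
from itertools import cycle


def assign_round_robin(num_users, servers):
    """Hand each new user to the next server in a fixed rotation."""
    rotation = cycle(servers)
    return Counter(next(rotation) for _ in range(num_users))


def total_capacity(num_servers, users_per_server):
    """Aggregate capacity if every server carries an equal share."""
    return num_servers * users_per_server


# Hypothetical names for the four web servers behind the load balancer.
servers = ["web1", "web2", "web3", "web4"]

# 100 active users spread over 4 servers gives 25 users each.
print(assign_round_robin(100, servers))

# Capacity for the two per-server figures discussed above.
print(total_capacity(len(servers), 410))
print(total_capacity(len(servers), 310))
```

Because the rotation is fixed, adding a cloned web server to the `servers` list raises the aggregate capacity by one server's share without any other change.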
The load-balanced schematic can be found here.
Final hardware configuration:
- NetApp EF550 solid state storage array
- Cisco UCS server platform
- Cisco ACE 4710 load balancers
- 10Gb network infrastructure at the core
- Four load balanced web servers running on RedHat
- One database server running on RedHat
- Moodle version 2.7.1
Best / Worst Analysis
- What is the best outcome that can occur, continuing with Moodle?
- What is the worst outcome that can occur, continuing with Moodle?
- What is the best outcome that can occur, by not continuing with Moodle?
- What is the worst outcome that can occur, by not continuing with Moodle?