Performance and Scalability Guidelines

Revision as of 10:34, 1 July 2014 by Dweuthen

To ensure continuous worry-free operation of MailStore Instances, the impact of infrastructure components such as network, storage and other hardware as well as the influence of different configuration options on the overall performance need to be understood.

Please understand that this document can only explain basic relationships; it cannot provide explicit instructions for every possible setup. If uncertain, please do not hesitate to contact our technical support.

The observations in this document focus solely on the requirements of an individual MailStore Instance running on an Instance Host of the MailStore Service Provider Edition. Servers holding the Instance Host role are therefore affected most by the recommendations in this document. These recommendations are based on an average yearly email volume of 10,000 emails per user. The number of users per Instance Host should not exceed 1,000, with a maximum of 300 users for each individual MailStore Instance.
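The user limits above can be checked against a planned layout with a few lines of Python. This is only a minimal sketch: the function name `check_instance_host` and the list-of-user-counts input are hypothetical and not part of MailStore; only the two limits come from this document.

```python
# Limits taken from the guideline above.
MAX_USERS_PER_HOST = 1_000
MAX_USERS_PER_INSTANCE = 300


def check_instance_host(users_per_instance: list[int]) -> list[str]:
    """Return a list of limit violations for one planned Instance Host.

    users_per_instance holds the planned user count of each MailStore
    Instance on the host (hypothetical input format).
    """
    problems = []
    for index, users in enumerate(users_per_instance):
        if users > MAX_USERS_PER_INSTANCE:
            problems.append(
                f"instance {index}: {users} users exceeds {MAX_USERS_PER_INSTANCE}"
            )
    total = sum(users_per_instance)
    if total > MAX_USERS_PER_HOST:
        problems.append(f"host: {total} users exceeds {MAX_USERS_PER_HOST}")
    return problems


if __name__ == "__main__":
    # Three instances with 300 users each stay within both limits.
    print(check_instance_host([300, 300, 300]))
    # One oversized instance and too many users on the host in total.
    print(check_instance_host([350, 300, 300, 100]))
```

An empty list means the planned layout stays within both recommended limits.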

Hardware Sizing

Providing suitable hardware is the most important prerequisite for high-performance operation and lasting scalability of MailStore Instances. The sections below describe how the individual hardware components help to ensure the level of performance required to satisfy customers' needs.

Processor (CPU)

High CPU usage occurs during email archiving when bodies and attachments are indexed. With an increasing number of concurrent operations and/or instances, the number of CPU cores has to be raised accordingly.

Keep in mind that an improper configuration of virtual CPU sockets and cores may lead to a decreased guest performance if virtualization technologies are used. Please consult the documentation of your virtualization solution for further details about the socket-to-core ratio of physical and virtual CPUs recommended for running multiple multi-threaded processes inside virtual machines.

Main Memory (RAM)

The average memory requirement increases slightly with the size of the archive and while archiving new email, browsing the archive or performing searches. Additional main memory is temporarily required for every email to be archived and when a search returns a large number of hits. It is therefore recommended to leave at least 20% of main memory unused.

To calculate the minimum memory requirements use the following simple formula:

   1024 MB + INSTANCES × 256 MB + USERS × 5 MB + 20%
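As a worked example, the formula can be evaluated with a short Python sketch. The function name is hypothetical, and the code assumes that "+ 20%" means adding 20% on top of the summed requirement; the result is rounded up to whole megabytes.

```python
def minimum_memory_mb(instances: int, users: int) -> int:
    """Minimum main memory in MB for one Instance Host.

    Implements: 1024 MB + INSTANCES x 256 MB + USERS x 5 MB + 20%.
    Assumption: the 20% is added on top of the sum (factor 1.2).
    """
    base = 1024 + instances * 256 + users * 5
    # Integer arithmetic for base * 1.2, rounded up to the next full MB.
    return (base * 12 + 9) // 10


if __name__ == "__main__":
    # One instance with 300 users: (1024 + 256 + 1500) * 1.2 = 3336 MB
    print(minimum_memory_mb(instances=1, users=300))
    # Four instances with 1,000 users in total: 7048 * 1.2 = 8457.6, rounded up
    print(minimum_memory_mb(instances=4, users=1000))
```

Under these assumptions, a host with four instances and 1,000 users needs roughly 8.3 GB of main memory as a minimum.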

Storage

MailStore Instances are very I/O-intensive processes and, due to their storage technology, comparable to database servers. For that reason, the storage used has a massive impact on the performance of MailStore Instances, affecting all areas from archiving email to end user access. The following topics should receive special attention:

Connectivity

Generally, direct attached storage (DAS) is preferred due to its low latency. When using network attached storage (NAS), the minimum required disk throughput for each instance is 5 megabytes per second. It is therefore recommended to establish a dedicated 1 Gbit/s link between each Instance Host and the NAS.
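The relationship between the per-instance throughput and the link speed can be illustrated with a rough calculation. The sketch below is not an official sizing tool: the function names are hypothetical, decimal units are assumed (1 Gbit/s = 1,000 Mbit/s), and protocol overhead is ignored, so real-world capacity will be lower than the theoretical value.

```python
# Minimum throughput per instance, taken from the guideline above.
MIN_THROUGHPUT_PER_INSTANCE_MB_S = 5


def required_nas_throughput_mb_s(instances: int) -> int:
    """Minimum aggregate NAS throughput in MB/s for the given instance count."""
    return instances * MIN_THROUGHPUT_PER_INSTANCE_MB_S


def link_capacity_mb_s(link_gbit_s: float) -> float:
    """Theoretical link capacity in MB/s (decimal units, no protocol overhead)."""
    return link_gbit_s * 1000 / 8


def link_is_sufficient(instances: int, link_gbit_s: float = 1.0) -> bool:
    """True if the theoretical link capacity covers the minimum requirement."""
    return required_nas_throughput_mb_s(instances) <= link_capacity_mb_s(link_gbit_s)


if __name__ == "__main__":
    print(required_nas_throughput_mb_s(10))  # 10 instances need at least 50 MB/s
    print(link_capacity_mb_s(1.0))           # 1 Gbit/s is 125 MB/s in theory
    print(link_is_sufficient(10))
```

Because the 5 MB/s figure is a minimum per instance, a link that only just passes this check leaves no headroom for peaks.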

Allocation

It is recommended to allocate dedicated disk arrays for each Instance Host to minimize the risk of interference from other I/O-intensive systems such as database or email servers. Additionally, separate disk arrays should be used for the system (operating system, program and temporary files) and for other data.

Hard Disks & RAID Level

Because the overall performance mainly depends on the maximum number of random disk I/O operations per second (also known as random disk IOPS), RAID arrays of level 1 or 10 that consist of multiple smaller disks are highly recommended.

RAID levels 5, 50, 6 and 60 on average do not offer any write speed gain. Their usage would lead to decreased overall performance and thus cannot be recommended for disk I/O-intensive applications such as running MailStore Instances.

For setups with many small instances, or fewer large instances with a high email volume, the use of fast (>= 10,000 rpm) SAS drives is highly recommended.

Adding solid state disks (SSD) for read-write caching or even offloading frequently requested data can result in extra performance, allowing a higher number of concurrent archiving threads or an increased maximum number of users per Instance Host.

If uncertain about the optimal configuration of your storage, please consult the storage system's documentation or vendor support to find out about the recommended configuration for disk I/O-intensive database applications.

Archiving Profiles

The choice of the archiving strategy may also have an impact on overall performance. Therefore it is important to know how the different archiving profiles provided by MailStore influence the usage of system resources.

Archiving Individual Mailboxes

Every running single-mailbox or multiple-mailbox archiving profile increases the workload on email servers and Instance Hosts.

Synchronizing the mailbox content with already archived emails primarily puts disk I/O load on the underlying storage, whereas archiving new emails also consumes CPU time and main memory for the purpose of indexing.

To minimize system load and limit the overall execution time of archiving profiles, no more than 5,000 emails should remain in the archived mailboxes. This can be achieved by letting MailStore delete emails from the mailboxes, e.g. emails older than a certain age. Furthermore, the number of concurrent archiving threads should be kept as low as possible (i.e. 100 per Instance Host).
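To get a feeling for a suitable deletion age, the 5,000-email limit can be related to the average volume of 10,000 emails per user and year assumed in this document. The following sketch is only an estimate: the function name is hypothetical, and it assumes that email arrives evenly throughout the year, which real mailboxes rarely do.

```python
def max_retention_days(emails_per_year: int = 10_000,
                       max_remaining: int = 5_000) -> int:
    """Largest whole number of days of email that keeps a mailbox at or
    below max_remaining, assuming an even arrival rate over 365 days."""
    return max_remaining * 365 // emails_per_year


if __name__ == "__main__":
    # Average volume assumed in this document: about half a year of email.
    print(max_retention_days())
    # A user with twice the average volume reaches the limit much sooner.
    print(max_retention_days(emails_per_year=20_000))
```

With the average volume, deleting emails older than about six months keeps a mailbox near the limit; users with a higher volume need a shorter deletion age.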

The workload created by these types of archiving profiles has to be considered high. Their usage can therefore only be recommended for environments with small mailboxes and a fairly low email volume.

Archiving Journal or Multidrop Mailboxes

When archiving emails from journal or multidrop mailboxes, emails are processed sequentially. If the recommendation to let MailStore delete emails from such mailboxes after they have been archived successfully is followed, no synchronization overhead exists, in contrast to archiving individual mailboxes. System resources like disk, processor and main memory are therefore only impacted when new emails are archived.

For that reason these archiving profiles are suitable for any environment, irrespective of mailbox sizes or email volume within given limits.

Archiving Email Clients or Files

The details in Archiving Individual Mailboxes also apply when archiving via email clients or directly from email files.

Additionally, the execution of client-side archiving profiles cannot be controlled from within the MailStore Instance, which increases the risk of too many concurrent archiving threads putting an unexpectedly high workload on the Instance Hosts.

Client-side archiving profiles are therefore suitable for one-time archiving of old emails stored locally on computers, as well as for continuous archiving in environments with a very small number of users and/or low email volume.

Other Activities

Searching

The impact of search operations depends on their scope, the number of archive stores and the number of items found. The more items are found, the more main memory is required for storing the search result, which means that searching across all user archives typically consumes much more main memory than searching in a single user archive. Also, every full text index file in every archive store must be accessed when a search across all users is performed, resulting in a high number of random reads from the storage. Please keep in mind that searches across all user archives containing several tens of thousands of emails commonly take several minutes.