What AWS drive options should I use for my EC2 instance?

I created a new Ubuntu c3.xlarge instance, and on the storage options page I can change the root volume to General Purpose SSD, Provisioned IOPS, or Magnetic. If I choose Provisioned IOPS, I can also set the IOPS value. The additional data store, Instance Store 0, has no parameters, but if I change it to EBS I get the same options.

I am really trying to understand:

  • The speed of each option
  • The cost of each option

Amazon's documentation is very vague.

I use this instance to load data from text files into a Postgres relational database. The files must be processed in sequence, with several INSERT statements per line, so it is slow on my local computer (5 million lines of data takes 15 hours). Initially the database was a separate RDS instance, but that was incredibly slow, so I installed a local database on the instance itself to remove the network latency. That made things a little faster, but it is still significantly slower than my modest local Linux server.

Looking at the instance monitoring while the data is loading, CPU usage is only 6%, so I now think the disk may be the limiting factor. The database uses / (I am not sure whether it is SSD or magnetic; how can I find out?), and the data files are on the /mnt drive (using Instance Store 0).

I only need this instance to do two things:

  • Load a database from data files
  • Create a Lucene search index from the database

(So the database is just an intermediate step.)

The search index is transferred to the EBean server, and then I don't need this instance for another month, when I repeat the process with new data. With that in mind, can I afford to spend more money on faster processing, given that I will use the instance only one day a month and can then stop it and not incur any further costs?

Please, what can I do to identify the problem and speed up the process?

+7
amazon-web-services amazon-ec2
2 answers

Here is my personal recommendation:

  • If the volume is small (<33 GB) and only needs occasional bursts of performance, such as a boot volume, use magnetic disks.

  • If you need predictable performance and high throughput, use PIOPS volumes and EBS-optimized instances.

  • Otherwise, use a General Purpose SSD volume.
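The three rules above can be written down as a small decision function. This is only a sketch of the recommendation as stated here: the function name, the parameter names, and the returned labels are made up for illustration and are not AWS API identifiers.

```python
def recommend_volume_type(size_gb, bursts_only, needs_predictable_performance):
    """Apply the three rules of thumb in the order they are listed.

    All names and return labels here are hypothetical; they only mirror
    the wording of the recommendation, not any AWS API.
    """
    # Rule 1: small volume (<33 GB) that only needs bursts, e.g. a boot volume.
    if size_gb < 33 and bursts_only:
        return "magnetic"
    # Rule 2: predictable performance and high throughput.
    if needs_predictable_performance:
        return "provisioned-iops"
    # Rule 3: everything else.
    return "general-purpose-ssd"
```

For the data-loading instance in the question, a call such as `recommend_volume_type(500, False, False)` falls through to the last rule.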

+6

Your CPU usage is only 6%; maybe you could try running the load in multiple processes?
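A rough sketch of what that could look like in Python is below. It assumes the input lines can be split into chunks that are loaded independently, which the question does not confirm (it says the files must be processed in turn). `process_chunk` is a hypothetical placeholder for the real per-line INSERT logic.

```python
from concurrent.futures import ProcessPoolExecutor


def chunked(lines, size):
    """Yield consecutive slices of `lines`, each at most `size` items long."""
    for start in range(0, len(lines), size):
        yield lines[start:start + size]


def process_chunk(chunk):
    """Hypothetical placeholder: the real version would open its own database
    connection and run the INSERT statements for every line in the chunk."""
    return len(chunk)


def load_in_parallel(lines, workers=4, chunk_size=1000):
    """Hand chunks to a pool of worker processes and return the lines handled."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunked(lines, chunk_size)))
```

Lines inside a chunk keep their order, but chunks may finish in any order, so this only helps if the INSERTs for one chunk do not depend on another chunk.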

Have you tested the I/O performance of the volumes on the remote instance?

PIOPS is expensive, and it is not much better than gp2; its only advantage is stability.

For example, I created a 500 GB gp2 volume and a 500 GB PIOPS volume with 1500 IOPS, inserted and queried 1,000,000 documents with MongoDB, and then checked I/O performance with tools such as mongoperf, iostat, mongostat, and dstat.

Each volume is expected to deliver up to 1500 IOPS, but gp2 IOPS is unstable, ranging from roughly 700 to 1600 (read + write); with reads only it can go up to 4000, and with writes only it reaches 800. PIOPS is very stable, at almost 1470 IOPS.
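If you want a rough read + write IOPS number without installing anything, you can take two samples of /proc/diskstats on Linux and divide the difference in completed I/Os by the interval. The sketch below is not how I produced the numbers above (I used the tools listed), and the device name `xvda` is only an assumption; check the names in /proc/diskstats on your instance.

```python
import time


def parse_diskstats(text, device):
    """Return (reads_completed, writes_completed) for `device`.

    In /proc/diskstats the device name is the 3rd field, reads completed
    the 4th, and writes completed the 8th.
    """
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 8 and fields[2] == device:
            return int(fields[3]), int(fields[7])
    raise KeyError(device)


def iops_between(before, after, seconds):
    """Turn two (reads, writes) samples into (read, write, total) IOPS."""
    reads = (after[0] - before[0]) / seconds
    writes = (after[1] - before[1]) / seconds
    return reads, writes, reads + writes


def measure(device="xvda", seconds=5):
    """Sample /proc/diskstats twice, `seconds` apart (Linux only)."""
    with open("/proc/diskstats") as f:
        before = parse_diskstats(f.read(), device)
    time.sleep(seconds)
    with open("/proc/diskstats") as f:
        after = parse_diskstats(f.read(), device)
    return iops_between(before, after, seconds)
```

Run `measure()` while the load is in progress and compare the total with the IOPS you expect from the volume.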

In your situation, I suggest considering gp2. The volume size depends on the IOPS you need: 500 GB gp2 = 1500 IOPS, 1 TB gp2 = 3000 IOPS (the maximum).
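The sizing arithmetic is easy to script. The sketch below assumes 3 IOPS per GB with a cap of 3000 IOPS, which is the ratio implied by the two figures above; the constant and function names are made up for illustration, and you should check the current limits before relying on them.

```python
import math

# Assumed from the figures above: 500 GB -> 1500 IOPS, 1 TB -> 3000 IOPS (cap).
GP2_IOPS_PER_GB = 3
GP2_MAX_IOPS = 3000


def gp2_baseline_iops(size_gb):
    """Baseline IOPS for a gp2 volume of `size_gb`, under the assumed ratio."""
    return min(size_gb * GP2_IOPS_PER_GB, GP2_MAX_IOPS)


def gp2_size_for_iops(target_iops):
    """Smallest gp2 size in GB that reaches `target_iops` under the assumed ratio."""
    if target_iops > GP2_MAX_IOPS:
        raise ValueError("target exceeds the assumed gp2 maximum")
    return math.ceil(target_iops / GP2_IOPS_PER_GB)
```

So if your load needs about 1500 IOPS, `gp2_size_for_iops(1500)` gives 500 GB, even if the data itself is much smaller.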

+4