Take a closer look at AWS DataSync, including its features and deployment options. Learn how data migration with AWS DataSync can fit into an organization's cloud strategy by comparing it with similar AWS services.
The major cloud providers, such as AWS, Azure, and Google, are convincing organizations to move their existing infrastructure and applications to the cloud. Organizations therefore need a service that can migrate their data to cloud infrastructure smoothly, with as few hiccups as possible.
AWS has introduced a new service, DataSync, to help migrate your existing infrastructure and applications to the AWS cloud. DataSync accelerates network data transfers between your on-premises data center and the AWS cloud. According to AWS, DataSync copies data over Direct Connect or an AWS VPN tunnel up to 10 times faster than open source tools such as Unison and rsync.
Data Migration with AWS DataSync: Features
DataSync works through an agent that runs as a VM in your data center and reads from your local storage over NFS. The agent establishes a secure TLS connection with the DataSync service, which then writes the data to Amazon Elastic File System (EFS) or S3 buckets.
AWS DataSync uses a proprietary data transfer protocol to accelerate data movement over the WAN. The protocol combines incremental transfers of changed files, inline compression, and sparse file detection. DataSync also encrypts the data and validates each transfer.
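The DataSync protocol itself is proprietary, so the following is only a rough illustration of the general idea behind an incremental transfer: compare a snapshot of the source with a snapshot of the destination and copy only what differs. The function names `snapshot` and `files_to_copy` are hypothetical, and comparing size and modification time is an assumption made for this sketch, not a description of how DataSync detects changes.

```python
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Map each file's path (relative to root) to a (size, mtime_ns) tuple."""
    result = {}
    for path in root.rglob("*"):
        if path.is_file():
            stat = path.stat()
            key = path.relative_to(root).as_posix()
            result[key] = (stat.st_size, stat.st_mtime_ns)
    return result


def files_to_copy(source: dict, destination: dict) -> list:
    """Return the paths that are new or whose metadata differs at the destination."""
    return sorted(
        path for path, meta in source.items() if destination.get(path) != meta
    )
```

On a first run the destination snapshot is empty, so every file is selected. On later runs only new or modified files are returned, which is what keeps repeat transfers small.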
The connection between the on-premises agent and AWS is multithreaded and scales horizontally. Additional agents can be added to increase throughput and maximize utilization of network links of up to 10 Gbps. You can also cap the bandwidth DataSync uses to avoid degrading network performance for other applications.
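To get a feel for what link speed and a bandwidth cap mean for a migration window, a back-of-the-envelope estimate can help. The sketch below is plain arithmetic, not a DataSync feature: the function name `transfer_hours` is hypothetical, it assumes decimal gigabytes, and it ignores compression, protocol overhead, and source storage limits.

```python
def transfer_hours(data_gb: float, link_gbps: float, utilization: float = 1.0) -> float:
    """Estimate the hours needed to move data_gb gigabytes over a link.

    link_gbps is the link speed in gigabits per second, and utilization is
    the fraction of the link the transfer may use (for example 0.5 when the
    bandwidth is capped at half the link).
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    gigabits = data_gb * 8  # 1 byte = 8 bits
    seconds = gigabits / (link_gbps * utilization)
    return seconds / 3600


# 4,500 GB over a fully used 10 Gbps link, then with a 50% bandwidth cap.
print(transfer_hours(4500, 10))
print(transfer_hours(4500, 10, utilization=0.5))
```

Halving the usable bandwidth doubles the estimated time, which is the trade-off to weigh when capping DataSync to protect other applications.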
When copying data into S3, every file is converted into an object, and file metadata is stored as S3 object metadata. When copying data to EFS, DataSync mirrors the complete directory structure of the NFS source and stores the file system metadata in EFS. DataSync uses TLS encryption during transmission and writes encrypted data to EFS or S3.
Users can monitor, log and audit usage of AWS DataSync with CloudWatch and CloudTrail.
Deploy AWS DataSync in a few steps:
- Log in to the AWS Console and set up the DataSync agent for a transfer from on-premises to AWS or vice versa.
- Download the agent software from the AWS Console as an Open Virtualization Archive (OVA) image. Installing the image in your data center requires a VM with four virtual CPUs, a minimum of 32 GB of RAM, and 80 GB of free disk space. Once the installation is complete, connect the agent over NFS to the systems that hold the data you wish to transfer to AWS.
- In the AWS Console, create a data transfer task that specifies the data source, the destination, and any required options.
Once created, DataSync tasks can be run from the AWS Console or from the command line interface (CLI). After the initial copy, each run of a task scans the source and destination for changes and copies only the differences.
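Tasks can also be created and started programmatically. The sketch below uses the boto3 `datasync` client (`create_task` and `start_task_execution`) and assumes that the agent and the source and destination locations already exist. The helper names, the ARNs passed in, and the choice of options are placeholders for illustration, not values from this article.

```python
def build_task_request(source_location_arn, destination_location_arn, name, max_mbps=None):
    """Build keyword arguments for the DataSync create_task API call.

    max_mbps is an optional bandwidth cap in megabits per second; the API
    expects the limit in bytes per second.
    """
    options = {
        "VerifyMode": "POINT_IN_TIME_CONSISTENT",  # validate data after the transfer
        "TransferMode": "CHANGED",                 # copy only data that differs
    }
    if max_mbps is not None:
        options["BytesPerSecond"] = int(max_mbps * 1_000_000 / 8)
    return {
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Name": name,
        "Options": options,
    }


def create_and_start_task(request, region_name):
    """Create the task and start one execution; needs boto3 and AWS credentials."""
    import boto3  # third-party dependency, imported here so the builder works without it

    client = boto3.client("datasync", region_name=region_name)
    task_arn = client.create_task(**request)["TaskArn"]
    execution = client.start_task_execution(TaskArn=task_arn)
    return execution["TaskExecutionArn"]
```

Keeping the request builder separate from the API call makes it possible to inspect or log the options, such as the bandwidth cap, before anything is sent to AWS.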
Use cases for AWS DataSync
Data migration with AWS DataSync is not limited to moving active application data into EFS files or S3 objects. It can be useful in other scenarios as well:
- Disaster recovery, or replicating data for long-term archival.
- Copying data from your on-premises data center into the AWS cloud for processing by AWS machine learning, analytics, or AI services.
- Using S3 as an auxiliary storage location, with the data accessed by on-premises applications via Storage Gateway.
Comparing DataSync with similar AWS services
For massive data sets, Snowball Edge is a good fit when the job is a one-time copy and bandwidth is limited. DataSync is better suited to synchronizing data sets that keep changing between environments.
Storage Gateway is an AWS service that complements DataSync by providing a real-time, low-latency connection between the two environments. DataSync can seed a data set into AWS, while Storage Gateway provides transparent access to it for on-premises users and applications.
S3 has a similar feature, S3 Transfer Acceleration, but it is designed for applications that use the S3 API. S3 Transfer Acceleration loads large amounts of data from remote clients around the world into an S3 bucket by integrating with CloudFront, which optimizes the network path between the remote clients and the AWS cloud.
Pricing of AWS DataSync
AWS DataSync is priced at a flat rate of $0.04 per GB of data transferred. DataSync also relies on other AWS services that carry their own separate charges, such as EFS, S3, VPN, VPN CloudHub or Direct Connect network connections, and CloudWatch, so the total cost of a migration will be higher than the transfer fee alone.
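As a quick worked example of the flat rate, the sketch below computes only the DataSync transfer fee. The function name `datasync_transfer_fee` is hypothetical, the rate is the $0.04 per GB quoted above (check current AWS pricing before relying on it), and charges from the other services are deliberately left out.

```python
from decimal import Decimal

# Flat rate quoted in this article, in US dollars per GB transferred.
DATASYNC_RATE_PER_GB = Decimal("0.04")


def datasync_transfer_fee(data_gb) -> Decimal:
    """Return the DataSync transfer fee in dollars, rounded to the cent.

    Storage, network, and monitoring charges from other AWS services
    are not included.
    """
    fee = Decimal(str(data_gb)) * DATASYNC_RATE_PER_GB
    return fee.quantize(Decimal("0.01"))


print(datasync_transfer_fee(500))   # 500 GB at $0.04 per GB
print(datasync_transfer_fee(1024))  # 1,024 GB at $0.04 per GB
```

`Decimal` is used instead of floating point so that currency amounts round predictably.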