---
title: Erasure Coding
category: administration
order: 9
---

With the release of version 3.0.0, Hadoop introduced the use of [Erasure Coding]
(EC) in HDFS. By default, HDFS achieves durability via block replication.
Usually the replication count is 3, resulting in a storage overhead of 200%.
Hadoop 3 introduced EC as a better way to achieve durability. EC behaves much
like RAID 5 or 6: for *k* blocks of data, *m* blocks of parity data are generated,
from which the original data can be recovered in the event of disk or node
failures (erasures, in EC parlance). A typical EC scheme is Reed-Solomon 6-3,
where 6 data blocks produce 3 parity blocks, an overhead of only 50%. In
addition to roughly doubling the available disk space, RS-6-3 is also more fault
tolerant: the loss of any 3 blocks can be tolerated, whereas triple replication
can only sustain the loss of two copies.
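The overhead and fault-tolerance comparison above is simple arithmetic, sketched below (an illustrative model only; the function names are hypothetical and not part of any Hadoop API):

```python
def storage_overhead(data_units: int, parity_units: int) -> float:
    """Extra storage consumed, as a fraction of the raw data size."""
    return parity_units / data_units

def tolerated_failures(parity_units: int) -> int:
    """Reed-Solomon can reconstruct after losing any `parity_units` blocks."""
    return parity_units

# 3x replication stores 2 extra copies per block: 200% overhead,
# and only 2 lost copies can be tolerated.
replication_overhead = 2 / 1                    # 2.0, i.e. 200%

# RS-6-3: 3 parity blocks per 6 data blocks.
rs63_overhead = storage_overhead(6, 3)          # 0.5, i.e. 50%
rs63_failures = tolerated_failures(3)           # any 3 lost blocks
```

The same arithmetic shows why RS-10-4 (used later in this document) is even cheaper: 4/10 = 40% overhead while tolerating 4 erasures.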

To use EC with Accumulo, it is highly recommended that you first rebuild Hadoop
with support for Intel's ISA-L library. Instructions for doing this can be found
[here](https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html#Enable_Intel_ISA-L).

### Important Warning
As noted
[here](https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html#Limitations),
the current EC implementation does not support `hflush()` and `hsync()`. These
functions are no-ops, which means that EC-coded files are not guaranteed to
be written to disk after a sync or flush. For this reason, **EC should never
be used for the Accumulo write-ahead logs. Data loss may, and most likely will,
occur.** It is also recommended that tables in the `accumulo` namespace (`root` and
`metadata`, for example) continue to use replication.

### EC and Threads
Due to the striped nature of an EC-encoded file, an EC-enabled HDFS client is threaded.
This becomes an issue when an Accumulo client or service is configured to use multiple
threads to read or write to HDFS, and it becomes especially problematic when doing bulk
imports. By default, Accumulo will use eight times the number of cores on the client
machine to scan the files to be imported and map them to tablet files. Each thread
created to scan the input files will create on the order of *k* threads to perform
parallel I/O. RS-10-4 on a 16-core machine, for instance, will spawn over a thousand
threads to perform this operation. If sufficient memory is not available, this operation
will fail without providing a meaningful error message to the user. This particular
problem can be ameliorated by setting the `bulk.threads` client property to `1C` (i.e.
one thread per core), down from the default of `8C`, and similar care should be taken
when setting other thread limits.
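The thread fan-out described above can be estimated with a rough back-of-the-envelope model (the exact thread counts inside the HDFS client are an implementation detail; the function below is only a sketch):

```python
def bulk_import_thread_estimate(cores: int, threads_per_core: int,
                                data_units: int) -> int:
    """Rough upper bound: file-scan threads times per-file striped-I/O threads."""
    scan_threads = cores * threads_per_core
    return scan_threads * data_units

# Default bulk.threads of 8C on a 16-core machine with RS-10-4 (k = 10):
default_estimate = bulk_import_thread_estimate(16, 8, 10)   # 1280: "over a thousand"

# Dropping bulk.threads to 1C cuts the estimate by a factor of 8:
reduced_estimate = bulk_import_thread_estimate(16, 1, 10)   # 160
```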

### HDFS ec Command
Encoding policy in HDFS is set at the directory level, with children inheriting
policies from their parents if not explicitly set. The encoding policy for a directory
can be manipulated via the `hdfs ec` command, documented
[here](https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html#Administrative_commands).

The first step is to determine which policies are configured for your HDFS instance.
This is done via the `-listPolicies` command. The following listing shows that there
are 5 configured policies, of which only 3 (RS-10-4-1024k, RS-6-3-1024k, and RS-6-3-64k)
are enabled for use.

```
$ hdfs ec -listPolicies
Erasure Coding Policies:
ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5], State=ENABLED
ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1], State=ENABLED
ErasureCodingPolicy=[Name=RS-6-3-64k, Schema=[ECSchema=[Codec=rs, numDataUnits=6, numParityUnits=3, options=]], CellSize=65536, Id=65], State=ENABLED
ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=3], State=DISABLED
ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4], State=DISABLED
```
| 67 | + |
| 68 | +To set the encoding policy for a directory, use the `-setPolicy` command. |
| 69 | + |
| 70 | +``` |
| 71 | +$ hadoop fs -mkdir foo |
| 72 | +$ hdfs ec -setPolicy -policy RS-6-3-64k -path foo |
| 73 | +Set RS-6-3-64k erasure coding policy on foo |
| 74 | +``` |
| 75 | + |
| 76 | +To get the encoding policy for a directory, use the `-getPolicy` command. |
| 77 | + |
| 78 | +``` |
| 79 | +$ hdfs ec -getPolicy -path foo |
| 80 | +RS-6-3-64k |
| 81 | +``` |
| 82 | + |
| 83 | +New directories created under `foo` will inherit the EC policy. |
| 84 | + |
| 85 | +``` |
| 86 | +$ hadoop fs -mkdir foo/bar |
| 87 | +$ hdfs ec -getPolicy -path foo/bar |
| 88 | +RS-6-3-64k |
| 89 | +``` |
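
The inheritance rule can be modeled as a walk up the directory tree (a toy model only, not HDFS code; `effective_policy` and the `policies` table are hypothetical stand-ins for NameNode state):

```python
from typing import Optional

# Directories with an explicitly set policy (toy stand-in for NameNode state).
policies = {"/accumulo/tables": "RS-6-3-64k"}

def effective_policy(path: str) -> Optional[str]:
    """Walk up toward the root until an explicitly set policy is found."""
    while path:
        if path in policies:
            return policies[path]
        path = path.rsplit("/", 1)[0]   # drop the last path component
    return None  # unspecified: the directory uses default replication

inherited = effective_policy("/accumulo/tables/3")   # "RS-6-3-64k" (inherited)
unset = effective_policy("/accumulo/wal")            # None (plain replication)
```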

Changing the policy for a parent also changes its children. The `-setPolicy`
command here issues a warning that existing files will not be converted. To
switch the policy for an existing file, you must create a new file (through
a copy, for instance). For Accumulo, if you change the encoding policy for
a table's directories, you would then have to perform a major compaction on
the table to convert the table's RFiles to the desired encoding.

```
$ hdfs ec -setPolicy -policy RS-6-3-1024k -path foo
Set RS-6-3-1024k erasure coding policy on foo
Warning: setting erasure coding policy on a non-empty directory will not automatically convert existing files to RS-6-3-1024k erasure coding policy
$ hdfs ec -getPolicy -path foo
RS-6-3-1024k
$ hdfs ec -getPolicy -path foo/bar
RS-6-3-1024k
```

### Configuring EC for a New Instance
If you wish to create a new instance with a single encoding policy for all tables,
you simply need to change the encoding policy on the `tables` directory after
running `accumulo init` (see the
[Quick Start]({% durl getting-started/quickstart#initialization %}) guide). Because
the tables in the `accumulo` namespace should continue to use replication, you
would then need to manually set their directories back to replication. Assuming
Accumulo is configured to use `/accumulo` as its root, you would do the following:

```
$ hdfs ec -setPolicy -policy RS-6-3-64k -path /accumulo/tables
Set RS-6-3-64k erasure coding policy on /accumulo/tables
$ hdfs ec -setPolicy -replicate -path /accumulo/tables/\!0
Set replication erasure coding policy on /accumulo/tables/!0
$ hdfs ec -setPolicy -replicate -path /accumulo/tables/+r
Set replication erasure coding policy on /accumulo/tables/+r
$ hdfs ec -setPolicy -replicate -path /accumulo/tables/+rep
Set replication erasure coding policy on /accumulo/tables/+rep
```

Check that the policies are set correctly:

```
$ hdfs ec -getPolicy -path /accumulo/tables
RS-6-3-64k
$ hdfs ec -getPolicy -path /accumulo/tables/\!0
The erasure coding policy of /accumulo/tables/!0 is unspecified
```

Any directories subsequently created under `/accumulo/tables` will
be erasure coded.

### Configuring EC for an Existing Instance
For an existing installation, the instructions are the same, but with the
caveat that changing the encoding policy for an existing directory will not
change the files within the directory. Converting existing tables to EC
requires a major compaction to complete the process. For instance, to
convert `test.table1` to RS-6-3-64k, you would first find the table ID
via the Accumulo shell, use `hdfs ec` to change the encoding for the
directory `/accumulo/tables/<tableID>`, and then compact the table.

```
$ accumulo shell
user@instance> tables -l
accumulo.metadata => !0
accumulo.replication => +rep
accumulo.root => +r
test.table1 => 3
test.table2 => 4
test.table3 => 5
trace => 1
user@instance> quit
$ hdfs ec -setPolicy -policy RS-6-3-64k -path /accumulo/tables/3
Set RS-6-3-64k erasure coding policy on /accumulo/tables/3
$ accumulo shell
user@instance> compact -t test.table1
```

### Defining Custom EC Policies
Hadoop by default will enable only a single EC policy, which is
determined by the value of the `dfs.namenode.ec.system.default.policy`
configuration setting. To enable an existing policy, use the `hdfs ec -enablePolicy`
command. To define custom policies, you must first edit the
`user_ec_policies.xml` file found in the Hadoop configuration directory,
and then run the `hdfs ec -addPolicies` command. For example, to add
RS-6-3-64k as a policy, you first edit `user_ec_policies.xml` and add
the following:

```xml
<configuration>
<layoutversion>1</layoutversion>
<schemas>
  <!-- schema id is only used to reference internally in this document -->
  <schema id="RSk6m3">
    <codec>rs</codec>
    <k>6</k>
    <m>3</m>
    <options> </options>
  </schema>
</schemas>
<policies>
  <policy>
    <schema>RSk6m3</schema>
    <cellsize>65536</cellsize>
  </policy>
</policies>
</configuration>
```
| 195 | +``` |
| 196 | +Here the schema "RSk6m3" defines a Reed-Solomon encoding with *k*=6 |
| 197 | +data blocks and *m*=3 parity blocks. This schema is then used to define |
| 198 | +a policy that uses RS-6-3 encoding with a stripe size of 64k. To add |
| 199 | +this policy: |

```
$ hdfs ec -addPolicies -policyFile /hadoop/etc/hadoop/user_ec_policies.xml
2019-11-19 15:35:23,703 INFO util.ECPolicyLoader: Loading EC policy file /hadoop/etc/hadoop/user_ec_policies.xml
Add ErasureCodingPolicy RS-6-3-64k succeed.
```

To enable the policy:

```
$ hdfs ec -enablePolicy -policy RS-6-3-64k
Erasure coding policy RS-6-3-64k is enabled
```
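
For intuition about what the 64 KiB cell size means for striping, the arithmetic can be sketched as follows (illustrative only; the values are taken from the policy file shown earlier, and the variable names are not part of any Hadoop API):

```python
CELL_SIZE = 65536      # bytes, from <cellsize> in user_ec_policies.xml
DATA_UNITS = 6         # k, from <k>
PARITY_UNITS = 3       # m, from <m>

# A full stripe holds one cell per data unit, so each stripe carries
# 6 * 64 KiB = 384 KiB of user data.
stripe_data_bytes = DATA_UNITS * CELL_SIZE          # 393216

# For every full stripe, parity cells are also written:
cells_per_stripe = DATA_UNITS + PARITY_UNITS        # 9 cells on 9 DataNodes
```

A smaller cell size lowers the per-stripe memory footprint of the striped client at the cost of more striping bookkeeping, which is one reason a 64k variant of RS-6-3 may be preferred over the default 1024k for some workloads.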

[Erasure Coding]: https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html