Intermediate Spark Interview Questions and Answers
12. What is RDD Lineage?
Spark does not support data replication in memory and thus, if any data is lost, it is rebuilt using RDD lineage.
RDD lineage is a process that reconstructs lost data partitions. The best thing about this is that RDDs always remember how to build from other datasets.