Data loss

From Computer Science Wiki
System Fundamentals[1]

Data loss is the destruction or corruption of data, whether accidental or deliberate. Data is any stored information (user data, operating system data, etc.).

Causes of data loss

There are hundreds of reasons we can experience data loss. However, these can broadly fit into the categories below.

Cause of data loss (with examples)
Malicious activity
  • An external malicious actor deliberately causing data loss
  • An internal malicious actor deliberately causing data loss
  • An innocent user tricked into running a malicious program that causes data loss
  • Deliberate sabotage by an actor (unplugging power, causing physical damage to the system)
Natural disaster
  • Flood, lightning, storm, earthquake, tornado (also called an act of God)
  • For an event to qualify as an act of God[2], it must:
    • (i) involve no human agency
    • (ii) not be realistically possible to guard against
    • (iii) be due directly and exclusively to natural causes, and
    • (iv) not have been preventable by any amount of foresight, planning, and care
System failure
  • Failure of any computer component (power supply, network interface card, temporary or permanent storage)
  • Loss of HVAC leading to heating / cooling issues

Consequences of data loss

The list below is an example to spur your thinking:

  • your email is lost
  • your bank balance is wrong or missing
  • your grades are lost
  • all your pictures are lost
  • your music is gone

Preventing data loss

Strategy for prevention: Description
Failover systems: A failover system is a standby (redundant) system that eliminates or reduces downtime for users by automatically taking over if the primary system suddenly becomes unavailable. A failover system does not have to be on the same physical server.
Redundancy: Having duplicate components, for example two power supplies, extra (unused) memory, or two network interface cards. This is very common in servers, because if one component fails, another is immediately available.
Removable media: When we store data on removable media such as a CD or tape, we reduce the chance that a failure on the system will damage the data, because the data isn't on the system.
Offsite/online storage: Moving data to another physical location helps protect it from problems in the original location. Online storage works the same way.
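
To make the failover idea concrete, here is a minimal sketch in Python. It is an illustration only: the Store and FailoverStore classes are hypothetical names invented for this example, and an in-memory dictionary stands in for a real server. Every write goes to each reachable store, and a read automatically falls back to the standby if the primary is unavailable.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical in-memory store standing in for a real server."""
    name: str
    available: bool = True
    data: dict = field(default_factory=dict)

    def write(self, key, value):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self.data[key] = value

    def read(self, key):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self.data[key]


class FailoverStore:
    """Writes to every reachable store; reads from the first reachable one."""

    def __init__(self, primary, standby):
        self.stores = [primary, standby]

    def write(self, key, value):
        written = 0
        for store in self.stores:
            try:
                store.write(key, value)
                written += 1
            except ConnectionError:
                continue  # another copy may still succeed
        if written == 0:
            raise ConnectionError("no store accepted the write")

    def read(self, key):
        for store in self.stores:
            try:
                return store.read(key)
            except ConnectionError:
                continue  # fail over to the next store
        raise ConnectionError("no store is available")


primary = Store("primary")
standby = Store("standby")
system = FailoverStore(primary, standby)

system.write("alice@example.com", "subscribed")
primary.available = False  # simulate a sudden failure of the primary
result = system.read("alice@example.com")
print(result)  # prints "subscribed", served by the standby
```

This sketch leaves out something a real failover system must handle: a store that was down during a write misses that data and has to be resynchronized before it can be trusted again.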

Real-world practical advice

I'm happy to report that data loss is rarer today than it was 10 years ago. However, without careful planning, we can be sure that data loss will eventually occur. Whenever you design a system, you should include fault tolerance, redundancy, offsite/online storage, and failover in your design.
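
As one small piece of such a design, the sketch below copies a file to a second location, checks the copy with a checksum, and later uses the recorded checksum to detect corruption and restore from the copy. It is a minimal illustration, not a complete backup tool: the function names (sha256_of, back_up, is_intact), the file name subscribers.csv, and the "offsite" directory are all assumptions made for this example, and a temporary directory stands in for a genuinely separate physical location.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify the copy by checksum."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source} does not match the original")
    return target


def is_intact(path: Path, expected_digest: str) -> bool:
    """True if the file exists and still matches the recorded checksum."""
    return path.exists() and sha256_of(path) == expected_digest


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = root / "live" / "subscribers.csv"  # hypothetical data file
    original.parent.mkdir()
    original.write_text("alice@example.com\nbob@example.com\n")

    recorded = sha256_of(original)               # remember the good state
    copy = back_up(original, root / "offsite")   # stand-in for another site

    original.write_text("alice@exam")            # simulate corruption

    original_ok = is_intact(original, recorded)
    copy_ok = is_intact(copy, recorded)
    if not original_ok and copy_ok:
        shutil.copy2(copy, original)             # restore from the backup
    restored_ok = is_intact(original, recorded)
```

The checksum matters because a backup that was itself silently corrupted gives a false sense of safety. Verifying at copy time and again before restoring catches that case.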

Do you understand this material?

A small business has a computer kiosk inside the store which allows customers to sign up for an email newsletter. If a customer signs up for the newsletter inside the store, they will get a 10% discount on their first purchase at the store. The owner hopes this 10% discount will be an incentive for customers to sign up for the email newsletter. The business will then regularly email the customers special offers and savings. The business owner expects to benefit from this system by having increased sales. The customers expect to benefit from this system by having access to special offers, saving money, and seeing what is new and trendy at the store.

  • Question 1: What would the consequences be if the system experienced data loss?
  • Question 2: How could the owner safeguard against data loss? (Please be very specific with your ideas.)

Do you have an advanced understanding of this material?

A school of 900 students has a secure web-based application which manages attendance data. The school administrators carefully track attendance so they can identify when a student's absences reach a customizable threshold. For example, the school might set a threshold of 5 absences within 30 days; when a student reaches it, the system automatically notifies the student, parent, and teacher that there is a problem with attendance. The threshold might instead be 3 absences within 10 days, or something similar. The system keeps track of attendance and tardies. The system has customizable attendance codes. For example, "absence for school trip", "excused absence", and "medical absence" are all allowed absence codes.

School administrators expect to benefit by having data about attendance so they can support students and parents to be in school. School administrators also expect to benefit by giving parents and students information about attendance (so parents can support their children to be in school). Finally, school administrators expect to benefit by using attendance data to apply for government funding (as they can prove how many students were in class on a specific day).

Parents expect to benefit by knowing when their children are in school or miss school. This way parents can support their children to be in school. Being in school is a shared value between the school and the parent.

Students expect to benefit by understanding how many days of school they have missed. The school expects students to have a strong "ownership of learning" and manage their attendance.

  • Question 1: What would the consequences be if the system experienced data loss?
  • Question 2: How could the school safeguard against data loss? (Please be very specific with your ideas.)

Standards

  • Identify a range of causes of data loss.
  • Outline the consequences of data loss in a specified situation.
  • Describe a range of methods that can be used to prevent data loss.


References