Working with files

From Computer Science Wiki
[[file:Hierarchical-structure.png|right|frame|Basics of programming<ref>http://www.flaticon.com/</ref>]]


== File input / output ==


We use files to store data. For example, imagine you have a program which functions as a store inventory system. The program allows a user to create, read, update and delete items in a store inventory. If you have no way of '''saving''' the changes you have made, you will need to make all of those changes again every time you restart the program.


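File [[Inputs and outputs|I/O]] is the process of reading '''from a file''' and writing '''to a file'''. These are normally treated as separate steps: a simple program opens a file either to read it or to write it, not both at once (many languages also offer combined read/write modes, but they are rarely needed in introductory programs).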
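
As a minimal sketch of the two steps in Python, the example below first opens a file for writing and then opens the same file again for reading. The file name <code>inventory.txt</code> and its contents are assumptions made up for illustration, and the file is created in a temporary folder so nothing important is overwritten.

```python
import os
import tempfile

# Hypothetical file name, placed in a temporary folder for safety.
path = os.path.join(tempfile.mkdtemp(), "inventory.txt")

# Step 1: open the file for writing ("w") and write two lines.
# The "with" statement closes the file automatically when the block ends.
with open(path, "w", encoding="utf-8") as f:
    f.write("oranges,5\n")
    f.write("bananas,10\n")

# Step 2: open the same file again, this time for reading ("r").
with open(path, "r", encoding="utf-8") as f:
    lines = f.read().splitlines()

print(lines)
```

Closing the file after writing (here done by <code>with</code>) matters: it makes sure the written data has actually reached the file before the program tries to read it back.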


Some programs '''only''' use files to create, read, update and delete data. For example, if we had a program which managed store inventory, '''every time''' you made a change (creating a new item, updating something about an item, deleting an item) the file would be updated, and every read would come from the file. One advantage of this method is that, in the event of unexpected power loss, you would lose very little data because every completed change has already been saved. Two disadvantages of this method are that your program would be much slower and that a file can be corrupted if a write is interrupted.
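
The file-only approach could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names (<code>read_inventory</code>, <code>write_inventory</code>, <code>set_quantity</code>, <code>delete_item</code>), the file name and the choice of JSON as the storage format are all hypothetical, not part of any particular inventory program.

```python
import json
import os
import tempfile

# Hypothetical file location for this sketch (a temporary folder).
PATH = os.path.join(tempfile.mkdtemp(), "inventory.json")

def read_inventory():
    """Read the whole inventory from the file (empty if no file exists yet)."""
    if not os.path.exists(PATH):
        return {}
    with open(PATH, "r", encoding="utf-8") as f:
        return json.load(f)

def write_inventory(inventory):
    """Write the whole inventory back to the file."""
    with open(PATH, "w", encoding="utf-8") as f:
        json.dump(inventory, f)

def set_quantity(item, quantity):
    """Create or update one item; the file is rewritten on every call."""
    inventory = read_inventory()
    inventory[item] = quantity
    write_inventory(inventory)

def delete_item(item):
    """Delete one item (if present) and rewrite the file immediately."""
    inventory = read_inventory()
    inventory.pop(item, None)
    write_inventory(inventory)

set_quantity("oranges", 5)
set_quantity("bananas", 10)
set_quantity("oranges", 4)
delete_item("bananas")
print(read_inventory())
```

Notice that every single change reads and rewrites the file. That is why this approach is safe but slow.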


Some programs use internal data structures and only save that data to a file, or read it from a file, at certain moments. For example, if we had a program which managed store inventory, the program would read the inventory from a file when it started and put that data into a data structure like a list or dictionary. We could then create, read, update or delete items. When we were ready (for example, when exiting the program), we could save those changes into a file. One advantage of this method is that the program would be much faster once the data in the file had been loaded into a data structure. One disadvantage is that if you forget to save, you lose all the changes you made since your last save.
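
A rough Python sketch of this load, change, save pattern is shown below. The helper names <code>load</code> and <code>save</code>, the file name and the use of JSON are assumptions chosen for illustration only.

```python
import json
import os
import tempfile

# Hypothetical file location for this sketch (a temporary folder).
PATH = os.path.join(tempfile.mkdtemp(), "inventory.json")

def load(path):
    """Read the inventory file into a dictionary (empty if no file exists)."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def save(path, inventory):
    """Write the dictionary to the file in one go."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(inventory, f, indent=2)

# Program start: load the data into a dictionary.
inventory = load(PATH)

# All changes happen in memory only, which is fast.
inventory["oranges"] = 5
inventory["bananas"] = 10
inventory["oranges"] -= 1
del inventory["bananas"]

# Nothing has been saved yet, so the file still holds no data.
unsaved_view = load(PATH)

# Program exit: save everything at once.
save(PATH, inventory)
reloaded = load(PATH)
print(unsaved_view, reloaded)
```

If the program stopped before the call to <code>save</code>, every change made in memory would be lost, which is the disadvantage described above.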


== A superb video to help you understand this ==
<html>
<iframe width="560" height="315" src="https://www.youtube.com/embed/Uh2ebFW8OYM" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</html>




== Serialized and human-readable ==
 
Serialization is the process of translating data structures or object state into a format that can be stored (for example, in a file or memory buffer)...and reconstructed later (possibly in a different computer environment). When the resulting series of bits is reread according to the serialization format, it can be used to create a semantically identical clone of the original object.<ref>https://en.m.wikipedia.org/wiki/Serialization</ref>
 
For example, if you have a complex dictionary data structure with all the items in your store inventory, you can serialize that dictionary so it can be '''perfectly''' reproduced. You can '''save''' the dictionary ''as a dictionary'' and then when you deserialize it (read it back from a file) you will have a dictionary object again. One disadvantage is that many serialized files, binary ones in particular, are not human-readable.
 
A human-readable medium or human-readable format is a representation of data or information that can be naturally read by humans.<ref>https://en.m.wikipedia.org/wiki/Human-readable_medium</ref>
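
As a hedged illustration of serialization in Python, the sketch below uses the standard-library <code>pickle</code> module to turn a dictionary into bytes and back again. The inventory contents are made up for the example. Note that <code>pickle</code> should only be used to load data you trust, because loading a malicious pickle can run arbitrary code.

```python
import pickle

# A made-up inventory dictionary, used only for illustration.
inventory = {
    "oranges": {"quantity": 5, "code": "123213"},
    "bananas": {"quantity": 10, "code": "11129"},
}

# Serialize: the dictionary becomes a sequence of bytes.
data = pickle.dumps(inventory)

# Deserialize: the bytes become a new, equal dictionary.
restored = pickle.loads(data)

print(type(data))             # the serialized form is bytes, not text
print(restored == inventory)  # same contents as the original
print(data[:20])              # not meant to be read by a human
```

The restored object is a separate dictionary with the same contents, while the serialized bytes in between are not something a person would want to read or edit by hand.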


== File formats ==  


=== XML ===
Please review the [[HTTP, HTTPS, HTML, URL, XML, XSLT, CSS#XML Definition|definition and characteristics of XML]]. XML is a human-readable file format which can also be processed by a machine. We could easily use XML for our store inventory system.
==== XML Example ====
I use this example with gratitude from W3Schools.<ref>https://www.w3schools.com/xml/cd_catalog.xml</ref>
<syntaxhighlight lang=xml>
<CATALOG>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
<COUNTRY>USA</COUNTRY>
<COMPANY>Columbia</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>1985</YEAR>
</CD>
<CD>
<TITLE>Hide your heart</TITLE>
<ARTIST>Bonnie Tyler</ARTIST>
<COUNTRY>UK</COUNTRY>
<COMPANY>CBS Records</COMPANY>
<PRICE>9.90</PRICE>
<YEAR>1988</YEAR>
</CD>
<CD>
<TITLE>Greatest Hits</TITLE>
<ARTIST>Dolly Parton</ARTIST>
<COUNTRY>USA</COUNTRY>
<COMPANY>RCA</COMPANY>
<PRICE>9.90</PRICE>
<YEAR>1982</YEAR>
</CD>
<CD>
<TITLE>Still got the blues</TITLE>
<ARTIST>Gary Moore</ARTIST>
<COUNTRY>UK</COUNTRY>
<COMPANY>Virgin records</COMPANY>
<PRICE>10.20</PRICE>
<YEAR>1990</YEAR>
</CD>
<CD>
<TITLE>Eros</TITLE>
<ARTIST>Eros Ramazzotti</ARTIST>
<COUNTRY>EU</COUNTRY>
<COMPANY>BMG</COMPANY>
<PRICE>9.90</PRICE>
<YEAR>1997</YEAR>
</CD>
<CD>
<TITLE>One night only</TITLE>
<ARTIST>Bee Gees</ARTIST>
<COUNTRY>UK</COUNTRY>
<COMPANY>Polydor</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>1998</YEAR>
</CD>
<CD>
<TITLE>When a man loves a woman</TITLE>
<ARTIST>Percy Sledge</ARTIST>
<COUNTRY>USA</COUNTRY>
<COMPANY>Atlantic</COMPANY>
<PRICE>8.70</PRICE>
<YEAR>1987</YEAR>
</CD>
</CATALOG>
</syntaxhighlight>
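
To show how a program might read XML like this, here is a small Python sketch using the standard-library <code>xml.etree.ElementTree</code> module. It assumes a shortened copy of the catalog (two CDs, with fewer fields) stored in a string; in a real program the text would more likely come from a file.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the catalog, kept in a string for illustration.
xml_text = """<CATALOG>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
<PRICE>10.90</PRICE>
<YEAR>1985</YEAR>
</CD>
<CD>
<TITLE>Hide your heart</TITLE>
<ARTIST>Bonnie Tyler</ARTIST>
<PRICE>9.90</PRICE>
<YEAR>1988</YEAR>
</CD>
</CATALOG>"""

# Parse the text into a tree; root is the <CATALOG> element.
root = ET.fromstring(xml_text)

# Turn every <CD> element into a dictionary.
cds = []
for cd in root.findall("CD"):
    cds.append({
        "title": cd.findtext("TITLE"),
        "artist": cd.findtext("ARTIST"),
        "price": float(cd.findtext("PRICE")),  # XML stores text, so convert
        "year": int(cd.findtext("YEAR")),
    })

for cd in cds:
    print(cd["title"], "by", cd["artist"], "costs", cd["price"])
```

Everything in an XML file is text, so the program has to decide which values should become numbers.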


=== JSON ===
JavaScript Object Notation (JSON) is an open-standard file format or data interchange format that uses human-readable text to transmit data objects consisting of attribute–value pairs and array data types (or any other serializable value). It is a very common data format, with a diverse range of applications, such as serving as a replacement for XML in AJAX systems.<ref>https://en.wikipedia.org/wiki/JSON</ref>
==== JSON Example ====
I use this example of JSON with gratitude from Wikipedia.<ref>https://en.wikipedia.org/wiki/JSON#JSON_sample</ref>
<syntaxhighlight lang=json>
{
  "first name": "John",
  "last name": "Smith",
  "age": 25,
  "address": {
    "street address": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postal code": "10021"
  },
  "phone numbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "fax",
      "number": "646 555-4567"
    }
  ],
  "sex": {
    "type": "male"
  }
}
</syntaxhighlight>
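
A minimal Python sketch of reading JSON like this is shown below, using the standard-library <code>json</code> module. It assumes a shortened version of the example stored in a string; the variable names are chosen only for illustration.

```python
import json

# A shortened version of the example, kept in a string for illustration.
json_text = """
{
  "first name": "John",
  "age": 25,
  "address": {"city": "New York", "state": "NY"},
  "phone numbers": [
    {"type": "home", "number": "212 555-1234"},
    {"type": "fax", "number": "646 555-4567"}
  ]
}
"""

# json.loads turns JSON text into Python dictionaries and lists.
person = json.loads(json_text)

# Nested objects become nested dictionaries; arrays become lists.
city = person["address"]["city"]
numbers = [phone["number"] for phone in person["phone numbers"]]

# Change a value, convert back to JSON text, then parse it again.
person["age"] += 1
round_trip = json.loads(json.dumps(person))

print(city, numbers, round_trip["age"])
```

Unlike XML, JSON keeps the difference between numbers and strings, so <code>age</code> arrives as an integer without any conversion.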
=== Plain Text ===
Plain text is usually delimited by an end-of-line character (usually invisible), so each discrete piece of data is on its own line. You can structure this data any way you want. That is wonderful and horrible at the same time: the format is completely flexible, but every program that reads the file has to know the structure you chose.
==== Plain Text Example ====
oranges,5,123213
bananas,10,11129
grapes,13,090665
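
One way a program might read lines like these is sketched below in Python. The meaning of the three fields is an assumption: the sketch treats them as an item name, a quantity and an item code. The code is kept as text so that a leading zero (as in <code>090665</code>) is not lost.

```python
# The example data, kept in a string for illustration.
text = "oranges,5,123213\nbananas,10,11129\ngrapes,13,090665\n"

items = []
for line in text.splitlines():
    # Assumed structure: name, quantity, code (separated by commas).
    name, quantity, code = line.split(",")
    items.append({
        "name": name,
        "quantity": int(quantity),  # convert the text "5" to the number 5
        "code": code,               # kept as text to preserve leading zeros
    })

for item in items:
    print(item["name"], item["quantity"], item["code"])
```

Nothing in the file itself says what each field means, so the reading program and the writing program must agree on the structure.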
=== Difference between JSON and XML ===
I am grateful to Mr. Sagar Khillar for this excellent image <ref>http://www.differencebetween.net/technology/protocols-formats/difference-between-json-and-xml/</ref>
[[File:JSON-VERSUS-XML-.jpg|frame|none]]
A namespace is a set of symbols that are used to organize objects of various kinds, so that these objects may be referred to by name. A namespace ensures that all the identifiers within it have unique names so that they can be easily identified.<ref>https://en.wikipedia.org/wiki/Namespace</ref>


== Do I understand this? ==


If you are still stuck, or you have other questions, you may want to [https://discuss.computersciencewiki.org/ '''ask a question on our discussion board'''].


== Helpful Links ==  

Latest revision as of 07:36, 22 September 2021
