Kiara Candelario’s Expanded Definition of Database

To: Prof. Jason Ellis

From: Kiara Candelario

Date: March 26, 2021

Subject: Expanded Definition of Database

Introduction

The purpose of this document is to provide an expanded definition of the word “database.” I chose this word because databases have a large impact on our lives without us realizing it. Two definitions of the word are provided and compared. Two instances of the word in use are then provided, with a comparison of how each author uses it. Lastly, a working definition is given based on the previous definitions and the context.

Definitions

According to the Oxford Dictionary, a database is “A structured set of data held in computer storage and typically accessed or manipulated by means of specialized software” (Oxford, 2021). The definition explains that a database is an organized set of data that can be stored, accessed, or manipulated on a computer. The specialized software mentioned in the definition, which is used to access and manage the data, is called a database management system (DBMS). There are many database management systems, including MySQL, Oracle, Microsoft SQL Server, and PostgreSQL.

According to Merriam-Webster, a database is “a usually large collection of data organized especially for rapid search and retrieval (as by a computer)” (Merriam-Webster, 2021). The definition explains that a database is a large amount of organized data that is accessible with the use of a computer, and that the organization exists so the data can be searched and retrieved quickly. This makes databases efficient for companies that have large amounts of data to store and to access later for use and reference. For example, doctors can quickly get access to their patients’ information because it is held in a database.

The Oxford Dictionary and Merriam-Webster definitions both describe a database as an organized collection of data stored on a computer. They differ in emphasis. The Oxford Dictionary sheds light on the specialized software needed to manipulate and access the data, which is a database management system. The Merriam-Webster definition sheds light on how large amounts of organized data are accessed quickly by search and retrieval. Both definitions are used in the technology industry as well as other industries.
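The two definitions can be seen side by side in a minimal sketch. It assumes SQLite, which ships with Python, as a stand-in for a DBMS such as MySQL or PostgreSQL; the `patients` table, its columns, and the sample rows are hypothetical and only illustrate the idea.

```python
import sqlite3

# SQLite (bundled with Python) stands in for a DBMS such as MySQL or PostgreSQL.
conn = sqlite3.connect(":memory:")

# The "structured set of data": a hypothetical table with named, typed columns.
conn.execute(
    "CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT, blood_type TEXT)"
)
conn.executemany(
    "INSERT INTO patients (name, blood_type) VALUES (?, ?)",
    [("Ana Ruiz", "O+"), ("Ben Cho", "A-"), ("Cy Ode", "B+")],
)

# An index helps the DBMS locate rows by name without reading every row,
# which is the "rapid search and retrieval" part of the second definition.
conn.execute("CREATE INDEX idx_patients_name ON patients (name)")

# Access and manipulation both go through the DBMS.
found = conn.execute(
    "SELECT blood_type FROM patients WHERE name = ?", ("Ben Cho",)
).fetchone()
conn.execute(
    "UPDATE patients SET blood_type = ? WHERE name = ?", ("AB+", "Cy Ode")
)
updated = conn.execute(
    "SELECT blood_type FROM patients WHERE name = ?", ("Cy Ode",)
).fetchone()

print(found[0], updated[0])
```

The table corresponds to the organized data in both definitions, and every read or change passes through the specialized software that the Oxford definition mentions.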

Context

Harrington states, “for the most part, today’s DBMS is intended as shared resources. A single database may be supporting thousands of users at one time.” In other words, a database can let many users access its information at once, and serving thousands of users simultaneously from a single database is part of what makes databases efficient. Although many users can access the data, security can be put in place so that only a restricted number of individuals can do so. For example, a company can limit a database to specific individuals in order to prevent data breaches.

Randle states, “The Brooklyn district attorney’s office said DNA had helped solve 270 cases, including sexual assaults and homicides. The role of the database became a flash point in the trial of Chanel Lewis, the Brooklyn man convicted in April of murdering Karina Vetrano, a jogger in Queens.” The article discusses how police use a DNA database to identify suspects. The police collect DNA samples from a crime scene and run them against the database to see if there is a match. If there is a match, the database shows the person whose DNA it corresponds to, along with the essential information stored about that person. The DNA database also holds DNA from individuals who are not criminals.

Both Harrington and Randle use the word database to mean a store of information that people reach through an application or other software, for example to retrieve individuals’ names. The application makes it easier for the user to search for items than retrieving them directly with a query language such as SQL.
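The idea of an application sitting in front of a database can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for the DBMS, and the `DirectoryApp` class, the `people` table, and the record codes are all hypothetical, not taken from either source.

```python
import sqlite3


class DirectoryApp:
    """Hypothetical application layer that hides SQL from its users."""

    def __init__(self):
        # SQLite stands in for whatever DBMS a real application would use.
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE people (record_code TEXT PRIMARY KEY, name TEXT)"
        )
        self._conn.executemany(
            "INSERT INTO people VALUES (?, ?)",
            [("R-001", "Jordan Lee"), ("R-002", "Sam Park")],
        )

    def find_name(self, record_code):
        """Return the name stored for a record code, or None if no match."""
        row = self._conn.execute(
            "SELECT name FROM people WHERE record_code = ?", (record_code,)
        ).fetchone()
        return row[0] if row else None


app = DirectoryApp()
print(app.find_name("R-002"))
```

The person using `find_name` supplies only a code and gets a name back; the SQL query is written once, inside the application.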

Working Definition

Based on the definitions and the contextual uses of the word “database,” a database is organized data stored on a computer. It can be modified and retrieved with the help of a database management system. The data, the DBMS, and the application associated with them are together called a database system. Access to the data can be restricted to people who are authorized. Many applications and websites have a database behind them to retrieve and update information.

References

Harrington, J. L. (2009). Relational database design and implementation: Clearly explained. ProQuest Ebook Central https://ebookcentral.proquest.com

Merriam-Webster. (n.d.). Database. In Merriam-Webster.com dictionary. Retrieved February 23, 2021, from https://www.merriam-webster.com/dictionary/database?src=search-dict-hed#other-words

Oxford University Press. (n.d.). Database. In Oxford Dictionary. Retrieved February 23, 2021, from https://www-oed-com.citytech.ezproxy.cuny.edu/view/Entry/47411?redirectedFrom=database#eid

Randle, A. (2019, August 16). Why the N.Y.P.D.’s DNA database has some people worried. The New York Times. Retrieved March 26, 2021, from https://www.nytimes.com/2019/08/16/nyregion/newyorktoday/nypd-dna-database.html?searchResultPosition=12

Summary of Eyada et al.’s “Performance Evaluation of IoT Data Management Using MongoDB Versus MySQL Databases in Different Cloud Environments”

TO:      Prof. Ellis

FROM:    Kiara Candelario

DATE:    3/03/2021

SUBJECT: 500-Word Summary of Article About Comparing Non-Relational and Relational Databases.

The following is a 500-word summary of a peer-reviewed article about testing and comparing MongoDB and MySQL using IoT data on a virtual machine. The Internet of Things (IoT) is a system that senses and collects data, and it is becoming a large part of many industries. According to the authors, “using IoT technology generates a large amount of heterogeneous data like texts, numbers, audio, videos, and pictures. These types of data need to be transferred, processed and stored” (Eyada et al., 2020, p. 110656). IoT data comes from different sources, and a database management system can assist with storing the amount of data that IoT creates. Relational DBMSs use SQL, which is popular, but IoT data is heterogeneous, which can negatively affect a relational database’s performance. According to the authors, a NoSQL database, also known as a non-relational database, is better suited to IoT data because it stores unstructured data and is schema-free. NoSQL also has high scalability and availability. Cloud computing can deal with large amounts of data, and databases use cloud computing to improve consistency, availability, and tolerance.

MySQL is a relational database system that uses SQL to store data in tables and needs a pre-defined schema. Any change to the schema can hinder performance and take the database offline. MongoDB is a non-relational database system that is document-oriented, and it stores data as BSON objects. It has quick query access, and a structure does not need to be declared in advance. MongoDB has features that provide better performance for long-term storage of large amounts of data, along with flexibility in how the data is shaped. The current experiment addresses the limitations of previous experiments by enhancing both databases and not limiting the number of sensor nodes.
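The difference between a pre-defined schema and a schema-free store can be illustrated with a small sketch. It is an approximation under stated assumptions: SQLite stands in for MySQL, plain Python dictionaries serialized as JSON stand in for MongoDB’s BSON documents, and the `readings` table and its fields are hypothetical.

```python
import json
import sqlite3

# Relational side: the schema is declared before any data is stored.
conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("CREATE TABLE readings (sensor_id TEXT, temperature REAL)")
conn.execute("INSERT INTO readings VALUES (?, ?)", ("s1", 21.5))

# A reading with a field the schema does not know about is rejected.
new_reading = ("s2", 22.0, 40.0)
insert_sql = (
    "INSERT INTO readings (sensor_id, temperature, humidity) VALUES (?, ?, ?)"
)
try:
    conn.execute(insert_sql, new_reading)
    schema_rejected = False
except sqlite3.OperationalError:
    schema_rejected = True

# The schema has to be changed first; only then does the insert succeed.
conn.execute("ALTER TABLE readings ADD COLUMN humidity REAL")
conn.execute(insert_sql, new_reading)
row_count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]

# Document side: each record carries its own structure, so records with
# different fields can sit next to each other (JSON stands in for BSON).
documents = [
    {"sensor_id": "s1", "temperature": 21.5},
    {"sensor_id": "s2", "temperature": 22.0, "humidity": 40.0},
]
encoded = [json.dumps(doc) for doc in documents]

print(schema_rejected, row_count, encoded[1])
```

In the relational half, the extra `humidity` field forces a schema change before the data fits. In the document half, the second record simply includes the extra field.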

MongoDB and MySQL store the IoT information, which is based on data collected about indoor and outdoor air pollution. In the MySQL database setup, two tables are created, named station_location and town_name, which manage the station’s location and the sensor nodes. In the MongoDB database setup, two collections are made. The first collection saves every station’s location, and the second collection holds all the sensors in the station. Node.js is the server-side platform used to process the collected data. Ubuntu 16.04 LTS is the operating system installed on the virtual machine to set up MongoDB, MySQL, and Node.js. Amazon Web Services’ Elastic Compute Cloud provides the virtual machine used to establish the environment.

The experiment measured how increasing the workload affects each database’s latency and size, and how increasing the number of sensor nodes affects performance. As the workload increased, MongoDB showed lower latency than MySQL. For database size under an increasing workload, MySQL outperformed MongoDB. Lastly, increasing the number of sensor nodes that connect to each station resulted in MongoDB outperforming MySQL significantly. Overall, the results indicate that MongoDB outperforms MySQL because MySQL loses performance as the workload increases.
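The general idea of measuring latency under a growing workload can be sketched as below. This is not the authors’ benchmark: it assumes an in-memory SQLite database in place of MySQL or MongoDB, and the `time_inserts` function, the `readings` table, and the workload sizes are hypothetical.

```python
import sqlite3
import time


def time_inserts(num_rows):
    """Insert num_rows hypothetical sensor readings and time the operation.

    Returns (elapsed_seconds, rows_stored).
    """
    conn = sqlite3.connect(":memory:")  # stand-in for the databases under test
    conn.execute("CREATE TABLE readings (sensor_id INTEGER, value REAL)")
    rows = [(i % 10, float(i)) for i in range(num_rows)]

    start = time.perf_counter()
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()
    elapsed = time.perf_counter() - start

    stored = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
    conn.close()
    return elapsed, stored


# Repeat the measurement with a larger workload each time.
for workload in (1_000, 10_000):
    elapsed, stored = time_inserts(workload)
    print(f"{workload} rows: {elapsed:.4f} s ({stored} stored)")
```

A comparison like the one in the article would run the same workloads against each database and compare the elapsed times. The timings printed here depend on the machine and say nothing about MongoDB or MySQL.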

Reference:

M. M. Eyada, W. Saber, M. M. El Genidy and F. Amer, “Performance Evaluation of IoT Data Management Using MongoDB Versus MySQL Databases in Different Cloud Environments,” in IEEE Access, vol. 8, pp. 110656-110668, 2020, doi: 10.1109/ACCESS.2020.3002164.