<?xml version="1.0" encoding="utf-8"?><!DOCTYPE Zthes SYSTEM "http://zthes.z3950.org/schema/zthes-1.0.dtd">  <Zthes><term><termId>1920</termId><termName>data cleaning</termName><termType>TT</termType><termLanguage>en-GB</termLanguage><termVocabulary>DaLiCo Glossary</termVocabulary>	<termStatus>active</termStatus>	<termApproval>approved</termApproval>	<termSortkey>data cleaning</termSortkey><termNote label="Nota catalográfica"><![CDATA[ <p>Dirty data can lead to incorrect decisions and unreliable analysis. Common errors include missing values, typos, mixed formats, duplicate entries for the same real-world entity, and violations of business rules. Analysts must consider the effects of dirty data before making any decisions. It is often said that 80% of the effort in data analysis goes into cleaning and preparing the data.</p>
<p>Source: Dasu T, Johnson T (2003). <em>Exploratory Data Mining and Data Cleaning.</em> John Wiley &amp; Sons. <br />Online: <a href="https://www.wiley.com/en-us/Exploratory+Data+Mining+and+Data+Cleaning-p-9780471268512">https://www.wiley.com/en-us/Exploratory+Data+Mining+and+Data+Cleaning-p-9780471268512</a> </p> ]]></termNote><termNote label="Definition"><![CDATA[ <p>Data cleaning is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database. It involves identifying incomplete, incorrect, inaccurate, or irrelevant parts of the data and then replacing, modifying, or deleting the dirty or coarse data.</p>
<p>Source: Wu, S. (2013). <em>A review on coarse warranty data and analysis.</em> Reliability Engineering and System Safety, 114, pages 1–11. <br />Online: <a href="https://doi.org/10.1016/j.ress.2012.12.021">doi:10.1016/j.ress.2012.12.021</a>.</p> ]]></termNote><termCreatedDate>data cleaning</termCreatedDate><relation><relationType>UF</relationType><termId>2052</termId><termName>data cleansing</termName><termType>ND</termType></relation><relation><relationType>UF</relationType><termId>2053</termId><termName>Datenbereinigung</termName><termType>ND</termType></relation><relation><relationType>UF</relationType><termId>2054</termId><termName>gegevensopschoning</termName><termType>ND</termType></relation><relation><relationType>UF</relationType><termId>2055</termId><termName>adattisztítás</termName><termType>ND</termType></relation><relation><relationType>UF</relationType><termId>2056</termId><termName>limpieza de datos</termName><termType>ND</termType></relation><relation><relationType>UF</relationType><termId>2059</termId><termName>data tidying</termName><termType>ND</termType></relation><relation><relationType>RT</relationType><termId>2144</termId><termName>data preparation</termName><termType>PT</termType></relation><relation><relationType>RT</relationType><termId>1979</termId><termName>clean</termName><termType>PT</termType></relation><relation><relationType>RT</relationType><termId>1819</termId><termName>data management</termName><termType>PT</termType></relation><relation><relationType>RT</relationType><termId>1829</termId><termName>data wrangling</termName><termType>PT</termType></relation></term>  </Zthes>