sql server - DQS How to cleanse duplicates in master table and correct IDs in all referenced tables -


we have lots of duplicates in master table. see below example, customer abc corp exists 3 times in master table , 3 custids referenced in orders table.

**customers table**  <table>  <tr><td><strong>custid</strong></td><td><strong>custname</strong></td></tr>  <tr><td>1001</td><td>abc corp.</td></tr>    <tr><td>1002</td><td>xyz corp.</td></tr>    <tr><td>1003</td><td>abc corp ltd.</td></tr>    <tr><td>1002</td><td>abc corporation limited.</td></tr>    </table>      **orders table**  <table>  <tr><td><strong>orderid</strong></td><td><strong>custid</strong></td></tr>  <tr><td>23425</td><td>1001</td></tr>    <tr><td>23466</td><td>1003</td></tr>    <tr><td>23488</td><td>1003</td></tr>    <tr><td>43877</td><td>1004</td></tr>    </table>

how can use data quality services remove duplicate customers customers table , update orders table reflect changes.

i.e. custids 1003 , 1004 merged 1001 , 1003 , 1004 orders table need updated 1001 too.

what have done far. created knowledge base customers synonyms custname , defined leading value. , created data cleansing project in dqs , analysed data , made corrections master data , imported corrected values customers knowledge base. understood dqs cleansing transformation in ssis automatically connecting source table , dqs kb , gives matching , non matching rows, still don't understand here. how data in customers table corrected , orders table updated accordingly. please suggest. examples have seen far using pretty basic stuff table , simple excel sheet.

geek,

when run sql server data quality service project cleansing last step wizard requests how export results.

if choose same source table target, replace original data new data after cleansing

but think best option export data temp table, execute sql query update validated changes

i had countrylist table. running dqs project data cleansing purpose , exported results (data cleansing info) new table countrylistdqs in sql server database

later, running sql update statement similar below can update original data validated changes dqs

update dbo.countrylist set     country = q.country_output dbo.countrylist c inner join dbo.countrylistdqs q on c.id = q.id_output q.country_status = 'corrected' 

Comments

Popular posts from this blog

java - Run spring boot application error: Cannot instantiate interface org.springframework.context.ApplicationListener -

reactjs - React router and this.props.children - how to pass state to this.props.children -

Excel VBA "Microsoft Windows Common Controls 6.0 (SP6)" Location Changes -