Data mining: a new business process

Rochester Institute of Technology

RIT Scholar Works Presentations and other scholarship

2000

Data mining: a new business process

John Wang, Qiyang Chen, Qiang Tu

Follow this and additional works at: http://scholarworks.rit.edu/other

Recommended Citation
Wang, John; Chen, Qiyang; and Tu, Qiang, "Data mining: a new business process" (2000). Accessed from http://scholarworks.rit.edu/other/376

This Conference Proceeding is brought to you for free and open access by RIT Scholar Works. It has been accepted for inclusion in Presentations and other scholarship by an authorized administrator of RIT Scholar Works. For more information, please contact [email protected].

[...] in the Internet Age

The Conference Proceedings, Volume I

[...] will need to be developed to take the fullest advantage of modern data [...]

[...] document warehouse installations [...] hundreds of gigabytes, terabytes, or petabytes of storage can be involved. It can be the largest investment that an enterprise will make in [...] queries and reports can be relatively [...]. The last ten years have seen a new set of technologies emerge, focused on unstructured information or documents. The problem of searching the semantics of structured data is quite different from the problem of searching the semantics of unstructured documents. The challenge is to analyze [...]
Over the last [...] no [...] have yet been defined using [...] data [...]

The next level of [...] employs data mining and pattern recognition to reveal the rules and patterns in a given database, predicting future [...] on the basis of these rules. This means that somehow the documents must be migrated to traditional rows and columns, exactly what we are trying to avoid, because this [...] the database and [...] the warehouse.
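As a rough illustration of what fitting documents into traditional rows and columns can look like, the sketch below builds a simple term-count table: one row per document, one column per distinct word. This is only a minimal example under stated assumptions. The function names `tokenize` and `document_term_matrix` and the two sample documents are hypothetical and do not come from the paper, and real document-mining systems use far richer representations.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(docs):
    """Return (vocabulary, rows): one row of term counts per document."""
    # Count the words of each document separately.
    counts = [Counter(tokenize(doc)) for doc in docs]
    # The columns are the sorted union of all words seen in any document.
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for missing words, so absent terms become 0 cells.
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


# Hypothetical sample documents, used only to show the resulting table.
docs = [
    "Data mining finds patterns",
    "Mining data, mining documents",
]
vocabulary, rows = document_term_matrix(docs)
for doc, row in zip(docs, rows):
    print(row, "<-", doc)
```

Once documents are in this tabular form, conventional record-oriented mining tools can process them, but word order and context are lost, which is one reason the conversion is unattractive.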

What are we left with? Perhaps the [...] to document mining is the Web. Here Internet image and text search engines abound and vie for user attention. Most offer Boolean search engine software to call up documents of interest. More sophisticated post-Boolean search engine software will find synonyms, disambiguate, build an internal thesaurus dynamically and interpret [...] will have a profound impact on our ability to respond to the growing need to extract information from documents with the addition of new Web technologies. These technologies include metatags for HTML, push technology, intelligent agents, and [...]

[...] bought last month, we can figure out who is ready to buy a particular product next month.

Data Accuracy

The average marketing database contains 15 percent bad data, and it is costing businesses about 10 percent of their revenue. This is particularly costly for direct marketers, who depend on their databases for their livelihood. Maintaining data quality is a process, and no database is perfect. Not for long, anyway. Do not ignore the data quality problem because you cannot fix it completely. Any improvement can have measurable results. As always, you should enhance your data in the order of greatest marginal return; i.e., if you are about to launch a massive direct mail campaign, clean the addresses before worrying about the phone numbers.

In addition to consolidation and currency, accuracy is another [...] success of many data warehousing projects [...] variations in abbreviations, formats, etc.; misspellings caused by similarities [...] during data entry; outdated information due to name and address changes; and transpositions resulting from [...]

Standardization: This allows you to arrange customer information [...] and variant spellings.

Enhancement appends data and completes missing information. Among the types of data that can be appended are [...] its own unique profile. The percent of the sample population [...] is then compared to results of the overall population. This allows the marketing analyst to review which segments [...] perform [...] overall [...] performs below the overall average.

Preventing Decay

Cleaning your database is a large accomplishment, but it will be short-lived if you [...] information and database systems to maintain everything from medical information to the [...]. And while many companies implemented sophisticated systems to manage record-based data, the transition to electronic document management, document [...] and document mining has been slower and more haphazard [...]

[...] in Great Shape?" Direct Marketing, Vol. 62, No. 5, Sep 1999, pp. 16-23.

Pass, Stephen, "Discovering Value in a Mountain of Data," OR/MS Today, Oct. 1997, pp. 24-28.

[...] Russo, Andrew W., "Data Mining and Mod
Internetworked Manufacturing: The Future Trends in Malaysia
Zulkifli Mohamed Udin and Hartini Ahmad, Universiti Utara Malaysia

Globalizing Business through the Internet in Developing Countries
Pyung E. Hon, [...] University, USA

Evaluating the Adequa