ABSTRACT A GRAPH BASED APPROACH FOR FINDING PEOPLE IN NEWS

2022-01-31 来源：易榕旅网

AGRAPHBASEDAPPROACHFOR

FINDINGPEOPLEINNEWS

athesis

submittedtothedepartmentofcomputerengineering

andtheinstituteofengineeringandscience

ofbilkentuniversity

inpartialfulfillmentoftherequirements

forthedegreeofmasterofscience

ByDeryaOzkanJuly,2007

IcertifythatIhavereadthisthesisandthatinmyopinionitisfullyadequate,inscopeandinquality,asathesisforthedegreeofMasterofScience.

Asst.Prof.Dr.PinarDuyguluS¸ahin(Advisor)

IcertifythatIhavereadthisthesisandthatinmyopinionitisfullyadequate,inscopeandinquality,asathesisforthedegreeofMasterofScience.

Asst.Prof.Dr.SelimAksoy

IcertifythatIhavereadthisthesisandthatinmyopinionitisfullyadequate,inscopeandinquality,asathesisforthedegreeofMasterofScience.

Prof.Dr.

ApprovedfortheInstituteofEngineeringandScience:

Prof.Dr.MehmetB.Baray

DirectoroftheInstitute

ABSTRACT

AGRAPHBASEDAPPROACHFORFINDING

PEOPLEINNEWS

DeryaOzkan

M.S.inComputerEngineering

Supervisor:Asst.Prof.Dr.PinarDuyguluS¸ahin

July,2007

Alongwiththerecentadvancesintechnology,largequantitiesofmulti-modaldatahasarisenandbecameprevalent.Hence,eﬀectiveandeﬃcientretrieval,organizationandanalysisofsuchdataconstitutesabigchallenge.Bothnewsphotographsonthewebandnewsvideosontelevisionformsthiskindofdatabycoveringrichsourcesofinformation.Peoplearemostlythemainsubjectofthenews;therefore,queriesrelatedtoaspeciﬁcpersonareoftendesired.

Inthisstudy,weproposeagraphbasedmethodtoimprovetheperformanceofpersonqueriesinlargenewsvideoandphotographcollections.Weexploitthemulti-modalstructureofthedatabyassociatingtextandfaceinformation.Ontheassumptionthataperson’sfaceislikelytoappearwhenhis/hernameismentionedinthenews,onlythefacesassociatedwiththequerynameareselectedﬁrsttolimitthesearchspaceforaqueryname.Then,weconstructasimilaritygraphofthefacesinthislimitedsearchspace,wherenodescorrespondtothefacesandedgescorrespondtothesimilaritybetweenthefaces.Amongthesefaces,therecouldbemanyfacescorrespondingtothequeriedpersonindiﬀerentconditions,posesandtimes.Therecouldalsobeotherfacescorrespondingtootherpeopleinthenewsorsomenon-faceimagesduetotheerrorsinthefacedetectionmethodused.However,inmostcases,thenumberofcorrespondingfacesofthequeriedpersonwillbelarge,andthesefaceswillbemoresimilartoeachotherthantoothers.Tothisend,theproblemistransformedintoagraphproblem,inwhichweseektoﬁndthedensestcomponentofthegraph.Thismostsimilarsubset(densestcomponent)islikelytocorrespondtothefacesofthequeryname.Finally,theresultofthegraphalgorithmisusedasamodelforfurtherrecognitionwhennewfacesareencountered.Inthepaper,ithasbeen

iii

shownthatthegraphapproachcanalsobeusedfordetectingthefacesoftheanchorpersonswithoutanysupervision.

Theexperimentsareperformedontwodiﬀerentdatasets:newsphotographsandnewsvideos.TheﬁrstsetconsistsofthousandsofnewsphotographsfromYahoo!newswebsite.Thesecondsetis229broadcastnewsvideosprovidedbyNISTforTRECVID2004.Imagesfromthebothsetsaretakeninreallifeconditionsand,therefore,havealargevarietyofposes,illuminationsandexpres-sions.Theresultsshowthatproposedmethodoutperformsthetextonlybasedmethodsandprovidescuesforrecognitionoffacesonthelargescale.

Keywords:Facerecognition,faceretrieval,SIFTfeatures.

¨OZET

HABERLERDEKIONEMLIYUZLER:ONEMLI

KISILERVEONEMLIYUZNOKTALARI

DeryaOzkan

BilgisayarM¨uhendisli˘gi,,Y¨uksekLisansTezY¨oneticisi:Yard.Do¸c.Dr.PinarDuyguluS¸ahin

Temmuz,2007

Bualmada,haberfotoraﬂarndanoluangeniverikmelerindekiilerinsorgulan-masnsalayanbiryntemsunulmutur.Yntemisimveyzlerinilikilendirilmesinedayanmaktadr.Haberbaslndakiininismigeiyorisefotoraftadaokiininyznnbulunacavarsaymyla,ilkolaraksorgulananisimileilikilendirilmi,fotoraﬂardakitmyzlerseilir.Buyzlerarasndasorgukiisineaitfarklkoul,pozvezaman-lardaekilmi,pekokresminyannda,haberdeismigeenbakakiilereaityzleryadakullanlanyzbulmayntemininhatasndankaynaklananyzolmayanresimlerdebulunabilir.Yinede,ouzaman,sorgukiisineaitresimlerdahaokolup,bures-imlerbirbirinedierlerineolduundandahaokbenzeyeceklerdir.Bunedenle,yzlerarasndakibenzerliklerizgeselolarakbetimlendiinde,birbirineenokbenzeyenyzlerbuizgedeenyounbileenolacaktr.Bualmada,sorguismiyleilikilendirilmi,yzlerarasndabirbirineenokbenzeyenaltkmeyibulan,izgeyedayalbiryntemsunulmaktadr.

Anahtars¨ozc¨ukler:Y¨uztanıma,y¨uzsorgulama,SIFT.

Acknowledgement

IwouldﬁrstliketoexpressmygratitudetomyadvisorPinarDuyguluS¸ahinforherguidanceandsupportthroughoutmystudies.Besideshervaluablecommentsandteaching,shewasmymainsourceofmoralesupportbyhighlymotivatingmeduringmygraduatestudy.Iamhonouredtobeherﬁrstmastersstudent.IamverythankfultoSelimAksoyforhissuggestionsandvaluablecomments.IwaspleasedtobeapartoftheRETINAteam,andhavingsuchanicefriendshipwiththegroupmembers.ThetwoyearswiththeminEA522willbeaverypleasantmemoryforme.

¨ITAK˙ThisresearchispartiallysupportedbyTUBCareergrantnumber104E065andgrantnumber104E077.

Contents

1Introduction1.1Motivation...............................1.2SummaryofContribution......................1.3

OrganizationoftheThesis......................

2Background2.1OnIntegrationofNamesandFaces.................2.2

OnFaceRecognition.........................2.2.1HolisticMethods.......................2.2.2LocalMethods........................2.2.3HybridMethods........................

2.3OntheUseofInterestPoints....................2.4

OntheUseofGraphTheoreticalMethodsinComputerVision..

3GraphBasedPersonFindingApproach3.1

Overview................................

vii

115811111415161718192121

CONTENTS3.2IntegratingNamesandFaces.....................3.3

ConstructingSimilarityGraphofFaces...............3.3.1GeometricalConstraint....................3.3.2UniqueMatchConstraints..................3.3.3

SimilarityGraphConstruction................

3.4GreedyGraphAlgorithmforFindingtheDensestComponent..3.5AnchorpersonDetectionandRemovalforNewsVideos......3.6

DynamicFaceRecognition......................3.6.1DegreeModeling.......................3.6.2

DistanceModeling......................

4Experiments4.1

DataSets...............................4.1.1NewsPhotographs......................4.1.2NewsVideos..........................

4.2EvaluationCriteria..........................4.3

ExperimentalResultsonNewsPhotographs............4.3.1MatchingPoints........................4.3.2GraphApproach.......................4.3.3

OnlineRecognition......................

4.4ExperimentalResultsonNewsVideos................

viii

22242627282931323233

36363637383939394040

CONTENTS4.4.1IntegratingFacesandNames................4.4.2AnchorDetection.......................4.4.3GraphApproach.......................

4.5AMethodforFindingtheGraphThresholdAutomaticaly....4.6

PerformanceAnalysis.........................

5Comparison5.1BaselineMethod...........................5.2

FeatureSelectionandSimilarityMatrixConstruction.......5.2.1FindingTrueMatchingPoints................

5.3

ExtractingSimilarGroupofFaces..................5.3.1k-nnApproach........................5.3.2

One-classClassiﬁcation....................

6ConclusionsandFutureWork6.1Conclusions..............................6.2

FutureWork..............................

ADiﬀerentFormsofNamesinNewsPhotographsix

4042424344

53535454555555

636365

ListofFigures

1.1Sampledetectedfacesinnewsphotographs[8],forwhichthenamePresidentGeorgeW.Bushappearsintheassociatedcaption....

1.2Samplenewsphotographsandtheirassociatedcaptions.Therecanbemorethanonefaceonthephotographormorethanonenameinthecaption..........................

1.3Sampleshotsfromnewsvideosandthespeechtranscripttextsalignedwiththoseshots........................

1.4Key-framesfromtwodiﬀerentvideos.Thenumbersbeloweachimageshowthedistancetoshot,inwhichthename’Clinton’ismentioned.Notethatinbothcases,Clintondoesnotappearvi-suallyintheshotinwhichhisnameismentionedbutappearsinpreceding(upimage)orsucceedingshots(bottomimage).....

1.5Samplefacesfromnewsphotographs[8](onthetop),andnewsvideos(onthebottom).

.......................

LISTOFFIGURES1.6Thefourstepsofouroverallapproach.Step1:Limitthesearchspaceofaquerypersontotheimagesthatareassociatedwiththequeryname.Step2:Constructasimilaritygraphoffacesinthissearchspace.Step3:Findthelargestdensestcomponentofthegraphcorrespondingtothefacesofthequeryperson.Step4:Usetheresultofthepreviousapproachasamodelinrecognizingnewfaces..................................

2.1Themainstepsinthefacerecognitionprocess(takenfrom[54])..3.1

Theﬁrstimageontheleftshowsallthefeaturepointsandtheirmatchesbasedontheminimumdistance.Thesecondimageontherightshowsthematchesthatareassignedastrueaftertheapplicationofgeometricalconstraints.................

3.2ForapairoffacesAandB,letA1andA2betwopointsonA;andB1isapointonBwiththearrowsshowingthematchesandtheirdirection.OntheleftisamultipleassignmentwherebothpointsA1andA2onAmatchB1onB.Insuchacase,thematchbetweenA2andB2iseliminated.OntherightisaonewaymatchwhereB1isamatchforA1,whereasB1matchesanotherpointA2onA.ThematchofA1toB1iseliminated.ThematchofB1toA2remainsthesameifB1isalsoamatchforA2;otherwiseitiseliminated................................

3.3Anexampleforuniquematchconstraint.Matchesfromthelefttotherightimageareshownbyred,dashedlines,whereasmatchesfromrighttoleftareshownbyyellowlines.Theleftimageshowsthematchesassignedafterapplyinggeometricalconstraints,butbeforeapplyingtheuniquematchconstraints.Therightimageshowstheremainingmatchesafterapplyingtheuniquematchcon-straints.................................

1015

LISTOFFIGURES3.4Examplesformatchingpoints.Notethat,evenforfaceswithdif-ferentsize,poseorexpressionsthemethodsuccessfullyﬁndsthecorrespondingpoints..........................

3.5Dissimilaritymatrixfor200imagesinthesearchspaceforthenameHansBlix.Inthissearchspace,97oftheimagesaretrueHansBliximages,andtheremaining103arenot.Forvisualization,the97trueBliximagesareputonthetopleftofthematrix.Darkcolorscorrespondtolargersimilarityvalues.............

3.6Exampleofconvertingaweightedgraphtoabinarygraph.Nodesandtheirdistancesaregivenintheﬁrstimage.Theresultinggraphafterapplying0.65astheproximitythresholdisgiveninthesecondimage.Boldedgesaretheedgesthatremainafterconversion.Theﬁnaldensestcomponentofthisgraphiscircledinthelastimage.

3.7Anchorpersonprobleminnewsvideos.Afterintegratingnamesandfacesforaqueryperson,itishighlyprobablethateitheroneoftheanchorpersonswillbereturnedasthedensestcomponentofthelimitedsearchspace.Theﬁrstﬁgureontheleftcorrespondstoasample..............................

4.1Namesof23peopleareusedintheexperiments.Thetotalnumberoffacesassociatedwithanameisrepresentedbyredbarsandnumberofcorrectfacesbygreenbars.

...............

4.2Recall-precisioncurveof23peopleinthenewsphotographsdataset.Precisionandrecallvalueschangedependingonthethreshold.Weusedthresholdvaluesbetween0.55and0.65toshowtheeﬀect.Thethresholdusedintherestofourexperimentsismarkedwithred.

..................................

xii

LISTOFFIGURES4.3Recallandprecisionvaluesfor23peopleforgraphthresholdvalueof0.575.Bluebarsrepresentrecallandredbarsrepresentprecisionvaluesthatareachievedwiththeproposedmethod.Greenbarsareprecisionvaluesforthebaselinemethod,whichdoesnotusethevisualinformationandretrievesthefaceswhennameappearsinthecaption.

............................

4.4Sampleimagesretrieved(ontheleft)andsampleimagesnotre-trieved(ontheright)forthequerynames:GeorgeBush(top),ColinPowell(middle),HansBlix(bottom)..............

4.5Recallandprecisonvaluesoftheheld-outset(ontheleft)andtheconstructedmodel(ontheright)foronlinerecognitionwithdegreemodelingmethod............................

4.6Recallandprecisonvaluesoftheheld-outset(ontheleft)andtheconstructedmodel(ontheright)foronlinerecognitionwithdistancemodelingmethod.......................

4.7Recallvaluesoftheheld-outsetandtheconstructedmodelforeachpersonforK=10........................

4.8Precisonvaluesoftheheld-outsetandtheconstructedmodelforeachpersonforK=10........................

4.9TheﬁgureshowsfrequencyofBillClinton’svisualappearancewithrespecttothedistancetotheshotinwhichhisnameismentioned.Left:whenthewholedatasetisconsidered,right:whenthefacesappearingaroundthenamewithintheprecedingandthefollowingtenshotsareconsidered.OverthewholedatasetClintonhas240facesand213ofthemappearintheselectedrange.........

4.10TherelativepositionofthefacestothenameforBenjaminNe-tanyahu,SamDonaldson,SaddamHussein,andBorisYeltsinre-spectively................................

xiii

LISTOFFIGURES4.11Detectedanchorsfor6diﬀerentvideos................4.12Recall-precisionvaluesforrandomlyselected10newsvideosfor

thresholdvaluesvaryingbetween0.55and0.65...........4.13Sampleimagesretrievedforﬁvepersonqueriesinexperiments.

EachrowcorrespondstosamplesforClinton,Netanyahu,SamDonaldson,Saddam,Yeltsinqueriesrespectively...........4.14Precisionsvaluesachievedforﬁvepeopleusedinourtests.....5.1RecognitionratesoftheeigenfacemethodfordiﬀerentKvalues.5.2

Examplesformatchingpoints.Firstcolumnformatchesfoundbyusingtheoriginalmatchingmetricofsift.Secondcolumnformatchesfoundbyapplyingtheproposedmethod..........5.3

Similaritymatrixfor201imagesinthesearchspaceforthenameHansBlix.Inthissearchspace,98oftheimagesaretrueHansBliximages,andtheremaining103arenot.Forvisualization,the98trueBliximagesareputonthetopleftofthematrix.Darkcolorscorrespondtolargersimilarityvalues.............5.4

Recall-precisioncurveof23peopleinthetestset.Precisionandrecallvalueschangedependingonthethreshold...........5.5

Examplesformatchingpointsassignedbythehomographymatrixfoundafterransac...........................5.6

Recognitionratesofthek-nnapproachfordiﬀerentKandkval-ues.Recallvaluesontheleftandprecisionontheright.Kisthepercentageoftheimagesusedfortesting,andkisthenumberofneighborsink-nn.

..........................

xiv

525254

LISTOFFIGURES5.7Similaritymatrixfor201imagesinthesearchspaceforthenameHansBlix.Inthissearchspace,98oftheimagesaretrueHansBliximages,andtheremaining103arenot.Forvisualization,the98trueBliximagesareputonthetopleftofthematrix.Darkcolorscorrespondtolargersimilarityvalues.............

5.8Recall-precisioncurveof23peopleinthetestset.Precisionandrecallvalueschangedependingonthethreshold...........

ListofTables

4.1RecognitionratesofdegreemodelingfordiﬀerentKvalues.(Kpercentoftheimagesareusedastheheld-outset............

4.2RecognitionratesofdistancemodelingfordiﬀerentKvalues.(Kpercentoftheimagesareusedastheheld-outset.........

4.3Numberoffacescorrespondingtothequerynameovertotalnum-beroffacesintherange[-10,10]and[-1,2]..............

4.4Numbersinthetableindicatethenumberofcorrectimagesre-trieved/totalnumberofimagesretrievedforthequeryname...

5.15.2

RecognitionratesoftheeigenfacemethodforfordiﬀerentKvalues.54RecognitionratesofsupervisedmethodfordiﬀerentKvalues.(Kisthepercentageoftheimagesthatareusedintesting;kisthenumberofneighboursused......................

5.3Recall-precisionratesoftwooneclassclassiﬁcationmethods:w1(nearestneighbordatadescriptionmethod)andw2(k-nearestneighbordatadescriptionmethod)(appliedontﬁdf’s).......

xvi

Chapter1Introduction

1.1

Motivation

Alongwiththerecentadvancesintechnology,largequantitiesofmulti-modaldatahasbecomemoreavailableandwidespread.Withitsemergence,eﬀectiveandeﬃcientretrieval,organizationandanalysisofsuchkindofbigdatahasbecomeachallengingproblemandarousedinterest.Newsphotographsonthewebandnewsvideosontelevisionaretwoexamplesofthistypeofdata.Theyacquirerichsourcesofinformationwherein.Hence,accessingthemisespeciallyimportantanditsimportancehasalsobeenacknowledgedbyNISTinTRECVIDvideoretrievalevaluation[1].

Peopleareusuallythemainsubjectofthestoriesinthenews.Therefore,queriesrelatedtoaspeciﬁcpersonisoftenadesiredtask.Ingeneralthesepeoplevisuallyappeararoundwhenhis/hernameismentionedinthenews.Onthisaccount,thecommonwaytoretrieveinformationrelatedtoapersonistosearchusinghis/hername.However,suchanapproachislikelytoyieldincorrectresultsduetothefollowingreasons:theremaybeotherpeopleinthesamestorythatalsoappearvisuallybesidesthequeryname,andthequerymaynotappearintheassociatedpictureevenifhis/hernameismentionedinthestory.Sotogetthetrueimagesofthequeryperson,afacedetectionshouldbeappliedtoextract

CHAPTER1.INTRODUCTION2

thedesiredfacesoftheperson.But,thismayalsocausesomenon-faceimagestoariseinthesearchresults,duetotheerrorsoffacedetectionalgorithmused.Figure1.1exempliﬁessomeofthedetectedfacesinnewsphotographsthatareassociatedwiththequerynamePresidentGeorgeW.Bush.Evenifthereappearsthefacesofthequeryperson,therealsoappearsthefacesofotherpeopleandsomenon-faceimagesthatarealsoassociatedwiththesamename.

Figure1.1:Sampledetectedfacesinnewsphotographs[8],forwhichthenamePresidentGeorgeW.Bushappearsintheassociatedcaption.

Newsphotographsappearwithanassociatedcaptionontheweb(seeFig-ure1.2).Therecouldbemorethanonefaceonthephotographormorethanonenameinthecaption.Thence,itisindeterminatethatwhichnamegoeswithwhichface.Similarlyinnewsvideos,thereisaspeechtranscriptalignedwitheachshot(seeFigure1.3).Theface-nameassociationproblemisalsoencoun-teredfortheseimages.Moreover,innewsvideosthereisusuallyatimeshiftbetweentheappearanceofanameandtheappearanceofthefacebelongingtothatname.Therefore,usingasingleshottemporallyalignedwiththespeechmayyieldincorrectresults.Anotherproblem,whichismoreimportant,isthatthemostfrequentfaceusuallycorrespondstotheanchorpersonorreporterratherthanthefaceofthequeryname(seeFigure1.4).Onthesegrounds,inordertoretrievethecorrectimagesofaparticularperson,visualinformationmustbeincorporatedandthefacesofthepersonneedtoberecognized.

Ontheotherhand,facerecognitionisalongstandingandadiﬃcultproblem(see[23,54,24,49,48]forrecentsurveys).Althoughmanydiﬀerentapproacheshavebeenproposedforrecognizingfaces,mostofthecurrentfacerecognitionmethodsareevaluatedonlyincontrolledenvironmentsandforlimiteddatasets.However,forlargerandmorerealisticdatasetslikenewsphotographsand/orvideos,facerecognitionisstilldiﬃcultanderror-proneduetothenoisyandcomplicatednatureofthesesets.Thesesetscontainlargevariationsinpose,

CHAPTER1.INTRODUCTION3

Figure1.2:Samplenewsphotographsandtheirassociatedcaptions.Therecanbemorethanonefaceonthephotographormorethanonenameinthecaption.illuminationandfacialexpression,whichcausethefacerecognitionproblemevenmorediﬃcult.

Recently,ithasbeenshownthattheperformanceofpersonqueriescanbeimprovedbyintegratingnameandfaceinformation[17,19,27,4,13,47].Whentextinformationisprovidedtogetherwiththevisualappearance,thefacerecog-nitionproblemcanbesimpliﬁedandtransformedintotheproblemofﬁndingassociationsbetweennamesandfaces[44,8,9].

Inthisstudy,weproposeamethodforimprovingtheperformanceofpersonqueriesinnewsdatasetsbyexploitingfrombothtextandvisualinformation.Asearchisstartedﬁrstbylookingforthenameofaquerypersoninthecaptionorinthespeechtranscripttext.Basedonthissearch,welimitoursearchspaceforthequerypersontotheimages/framesthatareassociatedwiththequeryname.Although,theremaybefacescorrespondingtootherpeopleinthestory,orsomenon-faceimagesduetothefacedetectionalgorithmused,thefacesofthequerypersonarelikelytobethemostfrequentlyappearingonesthananyotherpersoninitslimitedsearchspace.Eveniftheexpressionsorposesvary,diﬀerentappearancesofthefaceofthesamepersontendtobemoresimilartoeachotherthantothefacesofothers.Inotherwords,facesofthequeryperson

CHAPTER1.INTRODUCTION4

Figure1.3:Sampleshotsfromnewsvideosandthespeechtranscripttextsalignedwiththoseshots.

formsthelargestgroupofsimilarfacesinitslimitedsearchspace.

Inthiscontext,theproblemofﬁndingthefacesofthequerypersoninthelimitedsearchspaceisdualtothegraphproblemofﬁndingthelargestdensestcomponent.Tothisend,weﬁrstﬁndthesimilaritiesbetweenfacesinthesearchspaceandconstructasimilaritygraphoffaces,wherenodescorrespondtofacesandedgescorrespondtofacesimilarities.Then,wetransformourproblemtoagraphproblemofﬁndingthelargestdensestcomponentofthegraph.Thislargestdensestcomponentreferstothebiggestsetofhighlyconnectednodesinthegraph;thuslargestgroupofsimilarfacescorrespondingtothefacesofthequeryperson.Whenweﬁndthefacesofthequerypersonwiththeproposedapproach,thereturnedsolutionisalsousedasamodelforrecognizingnewfaces.Diﬀerentfromthemostofcurrentfacerecognitionsystems,weﬁndsimilaritybetweenthetwofacesbasedontheSIFTfeaturesextractedfromthosefaces.Theproposedmethodexploitsfromthescaleandilluminationinvariancechar-acteristicsofSIFTfeatures.Besides,italsoovercomestheproblemsinholisticappearanceorlocalfacialfeaturebasedfacerecognitionmodelsbybeingless

CHAPTER1.INTRODUCTION5

Figure1.4:Key-framesfromtwodiﬀerentvideos.Thenumbersbeloweachimageshowthedistancetoshot,inwhichthename’Clinton’ismentioned.Notethatinbothcases,Clintondoesnotappearvisuallyintheshotinwhichhisnameismentionedbutappearsinpreceding(upimage)orsucceedingshots(bottomimage).

sensitivetovariationsinnoise,occlusion,andillumination;andbyworkingalsointheabsenceofanyofthefacialfeatures.

Theproposedmethodisnotasolutiontothegeneralfacerecognitionprob-lem.Rather,itisamethodtoincreasetheretrievalperformanceofthepersonqueriesinthelargedatasetswherenamesandfacesappeartogetherandwheretraditionalfacerecognitionsystemscannotbeused.Itdoesnotrequireatrainingstepforaspeciﬁcpersonandtherefore,thereisnolimitonthenumberofpeoplequeried.

Inthefollowingtwosection,webrieﬂydescribetheoverallapproach,andthenpresenttheorganizationofthesis.

1.2SummaryofContribution

Ourpersonﬁndingapproachisbasedonﬁrstlimitingthesearchspaceofaquerypersonbyusingtextinformationandthensolvingtheproblembytransforming

CHAPTER1.INTRODUCTION6

ittoagraphproblem.Afterﬁndingthefacesofthequeryperson,wecanusetheresultasamodelforrecognizingnewfaces.Theoverallapproachconsistsoffoursteps:constructingalimitedsearchspaceforaquerypersonbyusingtextandnameinformation,deﬁningsimilaritiesbetweenfacesinthissearchspacetoformasimilaritygraphoffaces,ﬁndingthedensestcomponentofthisgraphwhichcorrespondstothefacesofthequeryperson,andﬁnallyusingtheresultasamodelfurtherinrecognizingnewfaces.ThemainstepsoftheproposedmethodaregiveninFigure1.6.

Intheﬁrststep,weusethetextinformationtolimitoursearchspaceforaqueryname.Itconsistsofsearchingforthequerynameinthecaptionorinthespeechtranscript,andchoosingtheimages/framesthatareassociatedwiththequeryname.Asstatedearlier,thereisusuallyatimeshiftbetweentheappearanceofanameandtheappearanceofthefacebelongingtothatnameinnewsvideos.Thisproblemishandledbytakingawindowaroundthename.Thesolutionis,ratherthansearchingthefacesonlyontheshotsincludingthenameoftheperson,alsotoincludetheprecedingandsucceedingshots.

Weassumethatinthislimitedsearchspace,facesofthequerypersonappearmorethanfacesofanyotherperson.Also,facesofthesamepersontendtobemoresimilartoeachotherthantothefacesofothers.Therefore,thefacesofthequerypersonformsthelargestgroupofsimilarfacesinthelimitedsearchspace.Hence,inthesecondstep,weassignasimilaritymeasurebetweeneachpairoffacesinthelimitedsearchspaceandrepresentthesesimilaritiesamongfacesinagraphstructure.Inthissimilaritygraph,nodescorrespondtothefacesandtheedgeweightscorrespondtotheassignedsimilarities.

Inthisstudy,thesimilaritybetweenfacesisrepresentedbyusinginterestpointsextractedfromtheface.WeuseLowe’sSIFTfeatures,whichhavebeenshowntobesuccessfulinrecognizingobjects[35,39,26]andfaces[28].DiﬀerentfromtheoriginalmatchingmetricofSIFT,weassigntheinterestpointshav-ingtheminimumdistanceastheinitialmatchingpoints.Then,weapplytwoconstraintonthesematchedpoints:geometricalconstraintanduniquematchconstraint,toeliminatethefalsematchesandselectthebestmatchingpoints.

CHAPTER1.INTRODUCTION7

Finally,theaveragedistanceofmatchingpointsbetweentwofacesisusedtoassignthedistancebetweenthesetwofaces.UsingtheSIFTfeatures,webothexploitfromthescaleandilluminationinvariantpropertyofthesefeaturesandareabletoassignadistancemetricbetweentwofacesevenintheabsenceofanyoffacialfeatures.

Intheconstructedsimilaritygraph,thenodescorrespondingtothefacesofthequerypersonwillbemorestronglyconnectedtoeachotherthantoothernodescorrespondingtootherfacesinthegraph.Moreover,thequerypersonistheonewhosefaceusuallyappearsthemostfrequentlyinitssearchspace.Consideringallthese,theproblemistransformedintoagraphprobleminthethirdstepandsolvedbyagreedygraphalgorithmthatreturnsthelargestdensestcomponentofthegraph.Thislargestdensestcomponentreferstothesetofhighlyconnectednodesinthegraph;thusthesetofmostsimilarfacescorrespondingtothefacesofthequeryperson.

Theﬁnalstepinvolvesusingtheresultofthegraphalgorithmasamodeltorecognizenewfaces.Withtheproposedgraphalgorithm,wecanautomaticallyobtainthelabeledtrainingsetforlearningthemodelinrecognizingnewfaces.Weproposetwodiﬀerenttechniquestousetheresultbasedon:degree,distance,andmatchnumbermodeling.Inallthosetechniques,themodelistrainedbyusingthereturnedgraphasthetrainingset.

Themethodhasbeenappliedontwodiﬀerentnewsdatasets:newspho-tographsandnewsvideos.Theﬁrstdataset,namelythenewsphotographscollectedbyBergetal.[8],isquitediﬀerentfrommostoftheexistingdatasets(seeFigure1.5).Itconsistsoflargenumberofphotographswithassociatedcap-tionscollectedfromYahoo!NewsontheWeb.Photographsaretakeninreallifeconditionsratherthaninrestrictedandcontrolledenvironments.Therefore,theyrepresentalargevarietyofposes,illuminationsandexpressions.Theyaretakenbothindoorsandoutdoors.Thelargevarietyofenvironmentalconditions,occlusions,clothing,andagesmakethedatasetevenmorediﬃculttoberecog-nized.

TheseconddatasetisbroadcastnewsvideosprovidedbyNISTforTRECVID

CHAPTER1.INTRODUCTION8

Figure1.5:Samplefacesfromnewsphotographs[8](onthetop),andnewsvideos(onthebottom).

2004videoretrievalevaluation[1].Duetothehighernoiselevelandlowerres-olution,newsvideosisaharderdatasettoworkwith.Inordertohandletheproblemduetoanchorpersonfaces,weaddamechanismtodetectandremovetheanchorpeople.Theanchorpersonistheonewhoappearsthemostfrequentlyineachnewsvideo.Hence,weapplythedensestcomponentalgorithmtoeachvideoseperatelyandautomaticallydetecttheanchorperson,sincethefacesoftheanchorpersonwillcorrespondtothebiggestdensestcomponentofthegraphformedbythefacesinthatvideo.

1.3OrganizationoftheThesis

Therestofthepaperisorganizedasthefollowing:

InChapter2,wesummarizerelatedpreviousworksonthesimilarproblems.inChapter3,wepresentadetailedexplanationoftheoverallgraphapproach:as-sociationofnamesandfacestolimitthesearchspaceforaqueryname,deﬁnitionofthesimilaritymeasuretoconstructthesimilaritygraph,ﬁndingthedensestcomponentofthegraph,andusingtheobtainedresultinrecognizingnewfaces.Themethodisalsoappliedonnewsvideosfordetectingtheanchorpeopleauto-matically.InChapter4,experimentalresultsaregivenalongwiththedescriptionofthedatasetsused.InChapter5,theproposedmethodiscomparedwithsomeotherpossiblemethods.Finally,wesummarizetheoverallworkinChapter6and

CHAPTER1.INTRODUCTION9

concludewithfutureresearch.

CHAPTER1.INTRODUCTION10

Figure1.6:Thefourstepsofouroverallapproach.Step1:Limitthesearchspaceofaquerypersontotheimagesthatareassociatedwiththequeryname.Step2:Constructasimilaritygraphoffacesinthissearchspace.Step3:Findthelargestdensestcomponentofthegraphcorrespondingtothefacesofthequeryperson.Step4:Usetheresultofthepreviousapproachasamodelinrecognizingnewfaces

Chapter2Background

Inthisthesis,weproposemethodforﬁndingandrecognizingfacesinnewsbyintegratingnamesandfaceswithagraphbasedapproach..Inthefollowingsections,weﬁrstdiscusssomeofthepreviousworkontheuseofnameandtextinformation.Then,wedeﬁnetheproblemoffacerecognitionandimportanceofinterestpointsinliteratureonsolvingtherecognitionproblem.Laterinthelastsection,webrieﬂytalkontheuseofgraphtheoreticalmethodsincomputervision.

2.1OnIntegrationofNamesandFaces

Newsvideoandnewsphotographcollectionspossessdiﬀerenttypesofresources,suchastext,speech,andvisuality.Recently,ithasbeenshownthateﬀectiveandeﬃcientaccessingandutilizationofsuchmulti-modaldatacanbesimpliﬁedbyintegratingthediﬀerenttypesofresources.Inmanyofthepreviousworksinliterature,namesandfacesareassociatedforbetterqueryresults.

In[52],Yangetal.showedthatthecombinationoftextandfaceinformationallowsbetterretrievalperformancesinnewsvideos.Inthatwork,theyavoidthediﬃcultiesoffacerecognitionbyusingthetextinformationforselecting

CHAPTER2.BACKGROUND12

someshotsastheinitialresultsandapplyingfacerecognitiononthoseshots.Bythisway,theyaimtoreducethenumberoffacestoberecognized,henceimprovetheaccuracy.Thetimingbetweennamesandappearancesofpeopleismodeledbypropagatingthesimilarityscoresfromtheshotscontainingthequeryperson’snametotheneighboringshotsinawindow.TheEigenfacealgorithmisusedforfacerecognition.Inthepaper,ananchordetectorhasalsobeenbuiltbycombiningthreeinformationresources:colorhistogramfromimagedata,speakerIDfromaudiodata,faceinfofromfacedetection.TheyuseFisher’sLinearDiscrimant(FLD)toselectthedistinguishingfeaturesforeachsourceofdata.Then,theyunitetheselectedfeaturesintoanewfeaturevectortobeusedinclassiﬁcation.Uponasimilarapproach,[16]usesthetextandimagefeaturestogethertoiterativelynarrowthesearchforbrowsingandretrievalofwebdocuments.

[14]alsouniﬁesthetextualandvisualstatisticsinasingle

indexingvectorforretrievalofwebdocuments.

Bergetal.[9]proposedamethodforassociatingthefacesinthenewspho-tographswithasetofnamesextractedfromthecaptions.Inthatpaper,theyﬁrstperformkernelprincipalcomponentanalysis(kPCA)toreducethedimen-sionalityoftheimagedataandlineardiscriminantanalysis(LDA)toprojectthedataintoitsdiscriminantcoordinates.EachimageisthenrepresentedwithbothavectorgainedafterthekPCAandLDAprocesses,andasetofassociatednamesextractedfromthecaption.Amodiﬁedversionofk-meansclusteringisusedtoassignalabelforeachimage.Clustersthatarefarawayfromthemeanareremovedfromthedata,anddiscriminantcoordinatesarere-estimated.Finally,clusterswhichshowhighfacialsimilarityaremerged.Theresultsgiveninthiswork,arethenimprovedin[8]byanalyzinglanguagemorecarefully.Inthelatterwork,theyalsolearnanaturallanguageclassiﬁerthatcanbeusedtodeterminewhoispicturedfromtextalone.

[19]integratesnamesandfacesusingspeechtranscripts,andimprovestheretrievalperformanceofpersonqueriesonTRECVID2004.Itﬁrstsearchesoverthespeechtranscripttextandselectsthekey-framesthatarealignedwiththequeryname.Then,itappliesSchneiderman-Kanade’sfacedetectionalgorithmoneachkey-frame.However,manyfalsepositivesareproducedwithsuchan

CHAPTER2.BACKGROUND13

approach.Hence,skincolorinformationisusedtoeliminatethefalsepositives.TheprobabilityofapixelbeingaskinpixelismodeledusingaGaussianproba-bilitydistributiononHSVcolorspace.Then,threefeatures(colorfeature,PCA,ICA)areextractedfromthefacestobeuseingroupingsimilarfaces.Whilegrouping,itusesG-meansclusteringandstartsfromsmallnumberofclusters,K,andincreasesKiterativelyifsomeclustersfailtheGaussianitytest(Kolmogorov-Simirovtest)[25].Aftergrouping,eachclusterisrepresentedbyonerepresen-tativeface:theonethatistheclosesttothemeanofitscluster.Hence,theproposedmethodincreasesthespeedofthesystembyreducingthenumberofimagesprovidedtotheuser.Anchorﬁlteringisalsoexperimentedbyselectingtheanchorrepresentativesandremovingthemfromtherestofthecluster.In[2]amethodforautomaticallylabelingofthecharactersinTVorﬁlmbyusingbothvisualandtextualinformation.First,italignsthesubtitlesandthescriptfortaggingsubtitleswithspeakeridentity.Itdetectsfacesandthentrackstocomposefacetracks.Whileobtainingthefacetracks,thenumberofpointtrackswhichpassthroughfacesiscounted,andifthisnumberislargerelativetothenumberofpointtrackswhicharenotincommontobothfaces,amatchisdeclared.Torepresenttheappearanceofaface,ninefacialfeaturesarelocatedandtwolocalfeaturedescriptorsareused:siftandsimplepixel-wiseddescriptorformedbytakingthevectorofpixelsintheellipticalregionandnormalizingtoobtainlocalphotometricinvariance.Furthertousetheadditionalcues,aboundingboxispredictedforeachface,whichisexpectedtocontaintheclothingofthecorrespondingcharacter.Speakeraredetectedbyasimplelipmotiondetectionalgorithm.Then,eachunlabeledfacetrackisrepresentedbyasetoffaceandclothingdescriptors.Finally,aprobabilisticmodelisusedtoclassifythesetracksbasedontheweightedprobabilitiesofthefaceandclothappearancefeatures.

CHAPTER2.BACKGROUND14

2.2OnFaceRecognition

Facerecognitionisalong-standingandadiﬃcultproblem,whichhasreceivedgreatattentionduetoitsdemandondiﬀerentﬁeldslikecommercialandlawenforcementapplications.Theproblemhasarousedinterestinvarioussciencedomainssuchaspatternrecognition,computervision,imageanalysis,psychology,andneurosciences.Thereisadirectrelationamongtheﬁndingsinanyofthosescience.Asstatedin[54],thereexistsstillmanyambiguousquestionsinvolvedintheprocessofhumanperceptionoffacesthatpsychologistsandneuroscientistsworkon,suchas:

•Isfacerecognitionauniqueanddiﬀerentfromobjectrecognition?•Dowerecognizefacesasawholeorbyindividualfeatures?•Whichfeaturesaremoreimportantforfacerecognition?

•Ishumanfaceperceptioninvarianttochangesinview-point,lightingand/orexpressions?

Inthecontextofcomputervision,abroadstatementofthefacerecognitionproblemisdeclaredasthethis:Givenastilloravideoimage,identifyorverifyoneormorepersonsintheimagebyusingastoreddatabaseoffaces([54,49]).Therearethreemainstepsinvolvedintheconﬁgurationofagenericfacerecog-nitionsystem(seeFigure2.1):detectionofthefaceregionintheimage,featureextractionfromthefaceregion,recognition.

In[54]andin[48],thefacerecognitionmethodshavebeenputintooneofthethreecategories:holisticmethods,localmethodsorhybridmethods.Wegiveabriefsummaryofthethosethreeapproachesinthenextthreesubsections.

CHAPTER2.BACKGROUND15

Figure2.1:Themainstepsinthefacerecognitionprocess(takenfrom[54]).

2.2.1HolisticMethods

Holisticmethodsusetewholefaceastherawinputtothesystem.Inthosemethods,eachfaceimageisrepresentedwithavectorcomposedbyconcatenat-ingthegray-levelpixelvaluesoftheimage.Amongtheholisticmethods,themostdeclaredtechniquesusedarePrincipalComponentAnalysis(PCA),FisherDiscriminantAnalysis(FDA),LinearDiscriminantAnalysis(LDA),andInde-pendentComponentAnalysis(ICA).

ThemainideainPCAistoprojectthetrainingdataintoasub-dimensionalfeaturespace,inwhichbasisvectorscorrespondtothemaximumvariancedi-rectionintheoriginalimagespace.EachPCAbasisvectorwasreferredasan”eigenface”in[36,37]thatcanbedisplayedasasortofghostlyface.Intheclas-siﬁcationstage,theinputfaceimageisﬁrstprojectedintothesubspacespannedbytheeigenfaces;whereeachfaceisrepresentedbyalinearcombinationoftheeigenfaces.Then,thenewfaceisclassiﬁedaccordingitspositioninthefacespacetothepositionsoftheknownindividuals.Laterinliterature,extensionsofthePrincipalComponentAnalysishavebeenproposedasin[53]byusingtwo-dimensionalPCAandin[50]byselectingdiscriminanteigenfacesforface

CHAPTER2.BACKGROUND16

recognition.

In[5],Bartlettetal.claimedthatICArepresentationsoffacesaresuperiortoPCA;henceICAcanperformbetterperformancesacrosssessionsandchangesinexpression.InICA,dataisprojectedontosomebasisthatarestatisticallyindependent.Italsoattemptstominimizessecond-orderandhigher-orderde-pendenciesintheinputdata.

Holisticmethodscanbeadvantageousinthecontextofcoveringtheglobalappearanceofthefaces.Oneotheredgeoftheholisticmethodsisthattheymaintainthediﬀusetextureandshapeinformation;hencecandiﬀerentiatethefaces.However,representingeachfaceimagebyafeaturevectormakestheholis-ticmethodssensitivetovariationsinappearancecausedbyocclusion,changesinilluminationand/orexpressions.Toovercomethisproblem,localfeaturerepre-sentationshavebeendevelopedinliteraturethatarelesssensitivetochangesinappearance.

2.2.2LocalMethods

Theimportanceofhair,eyesandmouthinfacehumanperceptionhasbeenhighlightedinpsychologyandneurosciences([29,11]).Ithasalmostbeenclaimedin[11]thatnoseplaysanimportantroleinfacerecognition.Onthesegrounds,localmethodshavebeenpresentedthatrepresenteachfaceimagebyasetoflowdimensionalfeaturevector,whichusuallycorrespondtothefacialfeatureslikeeyes,mouth,andnose.

Mainly,therearetwoapproachesusedinlocalmethods:localfeature-basedmethodsandlocalappearance-basedmethods.Thegeometricrelationsofthefeaturesareconsideredinlocalfeature-basedmethods.Insomeprimitivestudiesofthistypes,onlythegeometricmeasures–suchasdistancebetweentheeyesorthesizeoftheeyes–havebeenconsidered[32,31].Then,in[38]Manjunathetal.proposedamethodthatpreservesboththelocalinformationandtheglobaltopologyoftheface.Themethodstoresboththelocalinformationandthe

CHAPTER2.BACKGROUND17

featureinformationofthedetectedfacialfeatures;andconstructsatopologicalgraphusingthesefeatures.Then,theclassiﬁcationstage,theproblemissolvedbyagraphmatchingscheme.

Evenifthemethodin[38]isadvantageoussinceitconsidersboththesimilarityofthelocalfeaturesandtheglobaltopology,itissensitivevariationsthatcausesanychangeinthistopology.Consequently,Ladesetal.[12]proposedElasticBunchGraphMatchingScheme(EBGM)thatisbasedonadeformabletopologygraph.Eventhoughthemethodresolvestheproblemcausedbythechangesinappearance,itcannotremovetheproblemcausedbytheocclusionofanyoffacialfeatures.

Motivatedbythefoundingsinpsychologythatasetofsimplelinescanchar-acterizethestructureofanobject,henceissuﬃcienttorecognizeitsshape;afaceisrepresentedbytheFace-ARGin[42].Face-ARGconsistsofasetofnodesthatcorrespondstolinefeatures,andbinaryrelationsbetweenthem.UsingaFace-ARG,allthegeometricquantitiesandstructuralinformationofthefacecanbeencodedinanAttributedRelationalGraph(ARG)structure.Then,intheclassiﬁcationstage,partialARGmatchingisappliedontheconstructedFace-ARG’softhetestandreferencefaces.Theadvantageofthemethodoverrecentmethodsisthatitcanstillworkinpresenceofocclusionandexpressionchanges.Mostofthelocalfeature-basedmethodsaresensitivetoaccuratefeaturepointlocalization;whichisstillanunresolvedproblem.Hence,inlocalappearance-basedmethods,facialfeatureregionsaredetectedﬁrstandfeatures(suchasGaborwavelet[38]andHaarwavelet[34])areextractedfromtheseregionsintheimage.Diﬀerentclassiﬁersareappliedonthesefeaturesﬁnallyintheclassiﬁcationstep.

2.2.3HybridMethods

Itisstillaquestionifwerecognizefacesasawholeorbyindividualfeatures.Holisticmethodscovertheglobalfaceappearanceandcanmaintainthediﬀuse

CHAPTER2.BACKGROUND18

textureandshapeinformation.However,theyareverysensitivetothechangesinappearancethatmaybecausedbyocclusion,changesinilluminationand/orexpressions.Localmethodsislesssensitivetothosechanges,howeveraccuratelocalizationoffacialfeaturepointsstillremainsthatcanaﬀecttheperformanceofrecognition.Especiallyforimagesofsmallsize,facialfeaturedetectionisevenmoreproblematic;hencethelocalmethodscannotbeappliedinthatcase.Howtousethelocalfeaturestowithoutdisregardingtheglobalstructureisanotheropenproblem.Toovercomethoseproblemsandbeneﬁtfromdiﬀerentadvantagesofbothapproaches,hybridmethodscanbeused.Hybridmethodsaimtousebothholisticandlocalapproaches.However,howtousewhichfeaturesandwhichclassiﬁersinhybridmethodsisstillindeterminateandnotmuchinvestigatedinliteratureyet.

2.3OntheUseofInterestPoints

Recently,ithasbeenshownthatlocalimagefeaturesprovideagoodrepresenta-tionoftheimageforrecognitionandretrieval[6,41],whileglobalfeaturestendtobesensitivetoimagevariationsinviewpoint,poseandillumination.Amongthelocalfeatures,ScaleInvariantFeatureTransform(SIFT)techniquehasbeenshowntobesuccessfulinrecognizingobjects[35,39,26]andfaces[28,10].Thetechnique,whichhasﬁrstbeenpresentedin[35],providesdistinctiveinvariantfeaturesthatcanbeusedinreliablematchingbetweendiﬀerentimagesofanob-jectorscene.Themostpowerfulaspectofthesefeaturesisthattheyareinvarianttoimagescaleandrotation,andpartiallyinvarianttochangeinilluminationand3Dcameraviewpoint.

In[10],applicationoftheSIFTapproachinthescopeoffaceauthenticationhasbeeninvested.Therediﬀerentmatchingschemesareproposedinthepaper:1.Minimumpairdistance,2.Matchingeyesandmouth,3.Matchingonaregulargrid.Theﬁrstschemeproposestocomputethedistancesbetweenallpairsofkeypointsintwoimagesandassigntheminimumdistanceasthematchingscore.Usingthethatfaceandmouthregionsprovidemostoftheinformationforface

CHAPTER2.BACKGROUND19

recognition,onlythefeaturesintheseregionsareusedinthesecondscheme.Finally,featurelocationsareconsideredinthethirdschemebydividingthefaceimageintogridsandmatchingthefeaturesofcorrespondinggrids.

Sivicetal.hasproposedapersonretrievalsystemin[28]thatrepresentseachfaceimageasacollectionofoverlappinglocalSIFTdescriptorsplacedattheﬁvefacialfeaturelocations(eyes,mouth,nose,andmiddlepointbetweentheeyes).Theyﬁrstusetrackingtoassociatefacesintoface-trackswithinashottoobtainmultipleexemplarsofthesameperson.Then,theyrepresenteachface-trackwithahistogramofprecomputedface-featureexemplars.Thishistogramisusedformatchingtheface-tracks;henceretrievingsetsoffacesacrossshots.

2.4OntheUseofGraphTheoreticalMethodsinComputerVision

Graphtheoreticalapproacheshaverecentlybeenusedincomputervisionprob-lemsduetotheirrepresentationalpowerandﬂexibility.Theyallowvisionprob-lemstobecastinastrongtheoreticalareaandaccesstothefulldepotofgraphalgorithmsdevelopedincomputerscienceandoperationsresearch.Themostcommongraphtheoreticalproblemsusedincomputervisionincludemaximumﬂow,minimumspanningtree,maximumclique,shortestpath,maximalcommonsubtree/subgraph,graphpartitioning,graphindexing,graphmatching,etc.[18].Graphpartitioningalgorithmsthathavebeenusedin[22,30,45],targetthetwotypicalapplicationsofcomputervision:imagesegmentationandperceptualgrouping.Theyaddresstheproblemofmakingcutsinaweightedgraphaccordingtoanappropriateminimumweightcriterion.Intheseworks,dataelements(i.e.imagepixelpoints)correspondtotheverticesinthegraph,andsimilaritybetweenanytwoverticescorrespondtotheedgeweightbetweenthosevertices.In[7],theproblemofcontentbasedimageretrievalhasbeensolvedbyagraphmatchingscheme.Themainideausedwastorepresentanimagequeryasanattributedrelationgraph,andselectasmallnumberofmodelimagegraphsthataresimilar

CHAPTER2.BACKGROUND20

tothequeryimagegraph.

[3]hasproposedagraphtheoreticclusteringmethodforimagegroupingandretrieval.Themotivationoftheworkwasthataneﬃcientretrievalalgorithmshouldbeabletoretrieveimagesthatarenotonlyclose(similar)tothequeryimagebutalsoclosetoeachother.However,mostoftheexistingfeatureextrac-tionalgorithmscannotalwaysmapvisuallysimilarimagestonearbylocationsinthefeaturespace.Henceintheretrievalstep,itisoftentoretrieveirrelevantimages(ornottoretrieverelevantimages)simplybecausetheyareclosetothequeryimage(orabitfarawayfromthequeryimage).Inthiscontext,theyretrievebestNmatchesforaqueryimage,andbestNmatchesofeachoftheretrievedimages.Agraphisconstructedwithallthoseretrievedimages,inwhichnodescorrespondstotheimagesandedgeweightscorrespondtothesimilarities.Then,theretrievalproblemistransformedintoandsolvedbytheproblemofﬁndingthesetofnodesinthegraph,thatarenotasdenseasmajorcliquesbutarecompactenoughwithinuserspeciﬁedthresholds.

Chapter3

GraphBasedPersonFindingApproach

3.1

Overview

Itislikelythatinthenews,apersonwillappearwhenhis/hernameismentioned.Followingupthiscue,westartsearchforapersonbyﬁrstlookingforthenameofthequerypersonandlimitoursearchspacetotheimagesthatareassociatedwiththatname.Althoughtheremightbethefacesofotherpeopleornon-faceimagesinthislimitedsearchspace,mostlyquerypersonwillbetheonethatappearsmorethananyotherindividual.Visually,facesofaparticularpersontendtobemoresimilartoeachotherthantofacesofotherpeople.Basedontheseassumptions,wetransformtheproblemtoagraphproblem,inwhichthenodescorrespondtofacesandtheedgescorrespondtothesimilaritiesbetweenfaces,andweseektoﬁndthelargestdensestcomponentinthisgraph.Hence,ifwecandeﬁneasimilaritymeasureamongthefacesinthelimitedsearchspaceandrepresentthesimilaritiesinagraphstructure,thentheproblemofﬁndingthemostsimilarfacescorrespondingtotheinstancesofqueryname’sfacecanbetackledbyﬁndingthedensestcomponentinthegraph.

Inthefollowingthreesections,weﬁrstexplainthestepsofthegraphbased

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH22

personﬁndingapproach:integratingnamesandfacestolimitthesearchspace,deﬁningasimilaritymeasurebetweenfacestoconstructthesimilaritygraph,andthegreedygraphalgorithmtoﬁndthelargestdensestcomponentofthegraph.Followingthosesections,wewillexplaintheuseofpersonﬁndingapproachforautomaticanchorpersondetectioninnewsvideos.Then,inthelastsectionwedeﬁnetwomethodstousetheoutputofthethreesteppersonﬁndingschemelaterinrecognizingnewfaces.

3.2IntegratingNamesandFaces

Theﬁrststepofouralgorithminvolvesintegratingnameandfaceinformation.Inthisstep,weusethenameinformationtolimitoursearchspacetotheimagesaroundwhichthenameofthequerypersonappears.Tothisend,welookforthenameofthequerypersoninthecaptionorinthespeechtranscript;andchoosetheimagesthatareassociatedwiththequeryname.

Ontheweb,thenewsphotographsappearwiththecaptions.However,therecanbemorethanonefaceinaphotographandmorethanonenameinthecor-respondingcaption.Therefore,itisnotknownwhichfacegoeswithwhichname.Usingtheassumptionthatapersonislikelytoappearinaphotographwhenhis/hernameismentionedinthecaption,wereducethefacesetforaqueriedpersonbyonlychoosingthephotographsthatincludethenameofthatpersonintheassociatedcaption.However,aperson’snamecanappearindiﬀerentforms.Forexample,thenamesGeorgeW.Bush,PresidentBush,U.S.President,andPresidentGeorgeBush,allcorrespondtothesameperson.Wemergedthesetofdiﬀerentnamesusedforthesamepersontoﬁndallfacesassociatedwithalldiﬀerentnamesofthesameperson.Adetailedlistofdiﬀerentformsofnamescorrespondingtoeachquerypersonusedinourexperimentsisgivenintheap-pendix.

Forthenewsvideos,theprobabilityofapersonappearingonthescreenishighwhenhis/hernameismentionedinthespeechtranscripttext.Thus,looking

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH23

fortheshotsinwhichthenameofthequerynameismentionedisagoodplacetostartsearchoverpeople.However,itisproblematicsincetherecanbeatimeshiftbetweentheappearanceofthepersonvisuallyandtheappearanceofhis/hername.

Recently,ithasbeenshowedthatthefrequencyofaperson’svisualappear-ancewithrespecttotheoccurrenceofhis/hernamecanbeassumedtohaveaGaussiandistribution[52].Weusethesameideaandsearchfortherangewherethefaceislikelytoappearrelativetothename.Asweexperimentedon“Clinton”query,weseethattakingthetenprecedingandthetensucceedingshotstogetherwiththeshotwherethenameismentionedisagoodapproximationtoﬁndmostoftherelevantfaces(seeFigure4.9).However,thenumberoffacesinthisrange(whichwerefertoas[-10,10])canstillbelargecomparedtotheinstancesofthequeryname.Aswewillexplainintheexperiments,itisseenthattakingonlyoneprecedingandtwofollowingshots(whichwerefertoas[-1,2])isalsoagoodchoice.

Anotherprobleminnewsvideosisthatusuallythefacesoftheanchorpersonappeararoundaname.Forsolutiontothisproblem,weuseananchorpersondetectionmethodbasedonourgraphbasedapproachaswewillbeexplainedlater.

Integratingnamesandfacesproducesbetterretrievalperformancescomparedtosolelytext-basedmethods.However,theresultingsetmaystillcontainmanyfalsefacesduetothefollowingreasons:thequerypersonmaynotappearvisuallyevenifhis/hernameismentioned,theremaybeotherpeopleinthesamestorythatalsoappearvisuallytogetherwiththequeryperson,theremaybenon-faceimagesreturnedbythefacedetectionalgorithmused.However,thefacesofthequerypersonarelikelytobethemostfrequentlyappearingonesthananyotherpersoninthesamespace.Eveniftheexpressionsorposesvary,diﬀerentappearancesofthefaceofthesamepersontendtobemoresimilartoeachotherthantothefacesofothers.Inthefollowingsections,weexplainourstrategytoincreaseretrievalperformancebyﬁndingthecorrectfacesbyusingvisualsimilarities.

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH24

3.3ConstructingSimilarityGraphofFaces

WerepresentthefaceswiththeinterestpointsextractedfromtheimagesusingtheSIFToperator[35].Lowe’sSIFToperator[35],haverecentlybeenshowntobesuccessfulinrecognizingobjects[39,26]andfaces,[28].TheSIFTtech-niqueconsistsoffourmainsteps:1.Scale-spaceextremadetection,2.Keypointlocalization,3.OrientationAssignment,4.Keypointdescriptor.

Intheﬁrststep,potentialinterestpointsareextractedbylookingforallscalesandimagelocations.Ascalespaceoftheimageisconstructedﬁrstbyconvolvingtheimagewithvariable-scaleGaussianfunction.LettheinputimagebeI(x,y)then,theGaussian-blurredimageL(x,y,σ)canberepresentedby:

L(x,y,σ)=G(x,y,σ)∗I(x,y),

where,

y21−x2+2G(x,y,σ)=e2σ.22πσ

Diﬀerence-ofGaussianfunction(DoG)isusedtodetectstablekeypointloca-tionsinthisscalespace.DoGoftwonearbyscalesseperatedbyconstantkisgivebby:

DoG(x,y,σ)=(G(x,y,kσ)−G(x,y,σ))∗I(x,y)

=L(x,y,kσ)−L(x,y,σ).

Localmaximaand/orminimaofDOFgivesthecandidatekeypointlocation,andiscomputedbycomparingeachsamplepointwithboththeeightneighborpointsonthesamescaleandthenineneighborpointsontwoimagesintheneighborscale.

Inthesecondstep,eachcandidatekeypointlocationisﬁttoanearbydatatodetermineitslocation,scaleandratioofprincipalcurvatures.First,athreshold

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH25

onminimumcontrastisappliedtoremovetheunstableextremawithlowcon-trast.Then,asecondthresholdisappliedontheratioofprinciplecurvaturestoeliminatethestrongresponsesofdiﬀerence-of-Gaussianfunctionalongedgeswhicharepoorlydetermined.

Anorientationisassignedtothekeypointlocationsinthethirdstep,basedonlocalimagegradientdirections.Anorientationhistogramisformedfromgra-dientorientationsofthesamplepointsaroundakeypointbyprecomputingtheirpixeldiﬀerencesinascaleinvariantmanner.Peaksinthehistogramdenominatedominantdirectionsoflocalgradients;thusthe80percentofthehighestpeakinthehistogramisusedtoassigntheorientationofakeypoint.

Inthelaststep,adescriptorforeachkeypointiscomputedfromimagegra-dients.GradientsofpointswithinanarrayaroundthekeypointareweightedbyaGaussianwindowanditscontentissummarizedintoadescriptorarray,usingorientationhistograms.Thesefeaturesarethennormalizedtounitlengthandathresholdisappliedtothisunitvectorvaluestoreducetheeﬀectsofilluminationchange.

Weﬁrstuseaminimumdistancemetrictoﬁndallmatchingpointsandthenremovethefalsematchesbyaddingsomeconstraints.Foreachpairoffaces,theinterestpointsontheﬁrstfacearecomparedwiththeinterestpointsonthesecondfaceandthepointshavingtheleastEuclideandistanceareassumedtobethecorrectmatches.However,amongthesetherecanbemanyfalsematchesaswell(seeFigure3.1).

Inordertoeliminatethefalsematches,weapplytwoconstraints:thege-ometricalconstraintandtheuniquematchconstraints.Geometricalconstraintexpectsthematchingpointstoappeararoundsimilarpositionsonthefacewhenthenormalizedpositionsareconsidered.Thematcheswhoseinterestpointsdonotfallinclosepositionsonthefaceareeliminated.Uniquematchconstraintensuresthateachpointmatchestoonlyasinglepointbyeliminatingmultiplematchestoonepointandalsobyremovingone-waymatches.Inthenexttwosubsections,wegivethedetailsofhowthoseconstraintsareapplied.

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH26

3.3.1GeometricalConstraint

Weexpectthatmatchingpointswillbefoundaroundsimilarpositionsontheface.Forexample,thelefteyeusuallyresidesaroundthemiddle-leftofaface,evenindiﬀerentposes.Thisassumptionpresumesthatthematchingpairofpointswillbeincloseproximitywhenthenormalizedcoordinates(therelativepositionofthepointsonthefaces)areconsidered.

Toeliminatefalsematcheswhicharedistantfromeachother,weapplyageometricalconstraint.Forthispurpose,werandomlyselectedasetofimagesof10people(5facesforeachperson).Then,wemanuallyassignedtrueandfalsematchesforeachcomparisonandusedthemastrainingsamplestoberunonaquadraticBayesnormalclassiﬁer([43,51])toclassifyamatchedpointastrueorfalseaccordingtoitsgeometricaldistance.Thegeometricaldistance

√th

correspondingtotheiassignmentreferstoX2+Y2where

locX(i)locX(match(i))

−,

sizeX(image1)sizeX(image2)locY(i)locY(match(i))

−,

sizeY(image1)sizeY(image2)

andlocXandlocYholdXandYcoordinatesofthefeaturepointsintheimages,sizeXandsizeYholdXandYsizesoftheimagesandmatch(i)corre-spondstothematchedkeypointinthesecondimageoftheithfeaturepointintheﬁrstimage.

InFigure3.1,matchesbeforeandaftertheapplicationofthisgeometricalconstraintareshownforanexamplefacepair.Mostofthefalsematchesareeliminatedwhenthepointsthatarefarawayfromeachotherareremoved.Therelativeanglebetweenthepointscouldalsobeusedasageometriccon-straint.However,sincethecloserpointscouldhaveverylargeanglediﬀerences,itisnotreliable.

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH27

Figure3.1:Theﬁrstimageontheleftshowsallthefeaturepointsandtheirmatchesbasedontheminimumdistance.Thesecondimageontherightshowsthematchesthatareassignedastrueaftertheapplicationofgeometricalconstraints.

3.3.2UniqueMatchConstraints

Aftereliminatingthepointsthatdonotsatisfythegeometricalconstraints,therecanstillbesomefalsematches.Usually,thefalsematchesareduetomultipleassignmentswhichexistwhenmorethanonepoint(e.g,A1andA2)areassignedtoasinglepoint(e.g,B1)intheotherimage,ortoonewayassignmentswhichexistwhenapointA1isassignedtoapointB1ontheotherimagewhilethepointB1isassignedtoanotherpointA2ornotassignedtoanypoint(Figure3.2).Thesefalsematchescanbeeliminatedwiththeapplicationofanotherconstraint,namelytheuniquematchconstraint,whichguaranteesthateachassignmentfromanimageAtoanotherimageBwillhaveacorrespondingassignmentfromimageBtoimageA.

Thefalsematchesduetomultipleassignmentsareeliminatedbychoosingthematchwiththeminimumdistance.Thefalsematchesduetoonewayassignmentsareeliminatedbyremovingthelinkswhichdonothaveanycorrespondingas-signmentfromtheotherside.AnexampleshowingthematchesbeforeandafterapplyingtheuniquematchconstraintsaregiveninFigure3.3andinFigure3.4.

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH28

Figure3.2:ForapairoffacesAandB,letA1andA2betwopointsonA;andB1isapointonBwiththearrowsshowingthematchesandtheirdirection.OntheleftisamultipleassignmentwherebothpointsA1andA2onAmatchB1onB.Insuchacase,thematchbetweenA2andB2iseliminated.OntherightisaonewaymatchwhereB1isamatchforA1,whereasB1matchesanotherpointA2onA.ThematchofA1toB1iseliminated.ThematchofB1toA2remainsthesameifB1isalsoamatchforA2;otherwiseitiseliminated.

Figure3.3:Anexampleforuniquematchconstraint.Matchesfromthelefttotherightimageareshownbyred,dashedlines,whereasmatchesfromrighttoleftareshownbyyellowlines.Theleftimageshowsthematchesassignedafterapplyinggeometricalconstraints,butbeforeapplyingtheuniquematchconstraints.Therightimageshowstheremainingmatchesafterapplyingtheuniquematchconstraints.

3.3.3SimilarityGraphConstruction

Afterapplyingtheconstraintsandassumingthattheremainingmatchesaretruematches,wedeﬁnethedistancebetweenthetwofacesAandBastheaveragevalueofallmatches.

󰀂N

dist(A,B)=

i=1

D(i)

whereNisthenumberoftruematchesandD(i)istheEuclideandistance

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH29

Figure3.4:Examplesformatchingpoints.Notethat,evenforfaceswithdiﬀerentsize,poseorexpressionsthemethodsuccessfullyﬁndsthecorrespondingpoints.betweentheSIFTdescriptorsofthetwopointsfortheithmatch.

Asimilaritygraphforallfacesinthesearchspaceisthenconstructedusingthesedistances.WecanrepresentthegraphasamatrixasinFigure3.5.Thematrixissymmetricandthevaluesonthediagonalareallzero.Foramoreclearvisualrepresentation,thedistancesforthefacescorrespondingtothepersonweareseekingareshowntogether.Clearly,thesefacesaremoresimilartoeachotherthantotheothers.Ourgoalistoﬁndthissubsetwhichwillcorrespondtothedensestcomponentinthegraphstructure.

3.4GreedyGraphAlgorithmforFindingtheDensestComponent

Intheconstructedsimilaritygraph,facesrepresentthenodesandthedistancesbetweenthefacesrepresenttheedgeweights.Weassumethat,inthisgraphthenodesofaparticularpersonwillbeclosetoeachother(highlyconnected)anddistantfromtheothernodes(weaklyconnected).Hence,theproblemcanbetransformedintoﬁndingthedensestsubgraph(component)intheentiregraph.ToﬁndthedensestcomponentweadaptthemethodproposedbyCharikar[15]wherethedensityofsubsetSofagraphGisdeﬁnedas

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH30

Figure3.5:Dissimilaritymatrixfor200imagesinthesearchspaceforthenameHansBlix.Inthissearchspace,97oftheimagesaretrueHansBliximages,andtheremaining103arenot.Forvisualization,the97trueBliximagesareputonthetopleftofthematrix.Darkcolorscorrespondtolargersimilarityvalues.

f(S)=

|E(S)|

,|S|

whereE(S)={i,j∈E:i∈S,j∈S}andEisthesetofalledgesinG.Inotherwords,E(S)isthesetofedgesinducedbysubsetS.ThesubsetSthatmaximizesf(S)isdeﬁnedasthedensestcomponent.

OurgoalistoﬁndthesubgraphSwiththelargestaveragedegreethatisthesubgraphwiththemaximumdensity.Initially,thealgorithmpresentedin[15]startswiththeentiregraphGandsetsS=G.Then,ineachstep,thevertexwiththeminimumdegreeisremovedfromS.Thealgorithmalsocomputesthevalueoff(S)foreachstepandcontinuesuntilthesetSisempty.Finally,thesetS,thathasmaximumf(S)value,isreturnedasthedensestcomponentofthegraph.

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH31

Inordertoapplytheabovealgorithmtotheconstructedsimilaritygraph,weneedtoconvertitintoabinaryform,sincethealgorithmdescribedaboveonlyworkswellforbinarygraphs.Thus,beforeapplyingit,weconvertouroriginaldissimilarityvaluesintoabinaryform,inwhich0indicatesnoedgeand1indicatesanedgebetweentwonodes.Thisconversioniscarriedoutbyapplyingathresholdonthedistancebetweenthenodes.Thisthresholdalsoconnoteswhatwedeﬁneasnear-byand/orremote.AnexampleofsuchaconversionisgiveninFigure3.6.Intheexample,assumethat0.65isdeﬁnedasourproximitythreshold.Inotherwords,ifthedistancebetweentwonodesislessthanorequalto0.65thenthesetwonodesarenear-by;thereforeweputanedgebetweenthem.Otherwise,noedgeismaintainedbetweenthesenodes,sincetheyarefarawayfromeachother.

3.5AnchorpersonDetectionandRemovalforNewsVideos

Whenwelookattheshotswherethequerynameismentionedinthespeechtranscript,itislikelythattheanchorperson/reportermightbeintroducingorwrappingupastory,withtheprecedingorsucceedingshotsbeingrelevant,butnotthecurrentone.Therefore,whentheshotsincludingthequerynameareselected,thefacesoftheanchorpersonwillappearfrequentlymakingouras-sumptionthatthemostfrequentfacewillcorrespondtothequerynamewrong.Hence,itishighlyprobablethattheanchorpersonwillbereturnedasthedensestcomponentbythepersonﬁndingalgorithm(seeFigure3.7).Thesolutionistodetectandremovetheanchorpersonbeforeapplyingthealgorithm

In[17],asupervisedmethodforanchorpersondetectionisproposed.Theyintegratecolorandfaceinformationtogetherwithspeaker-idextractedfromtheaudio.However,thismethodhassomedisadvantages.Firstofall,ithighlydependsonthespeaker-id,andrequirestheanalysisofaudiodata.Thecolorinformationisusefultocapturethecharacteristicsofstudiosettingswheretheanchorpersonislikelytoappear.But,whentheanchorpersonreportsfroman-otherenvironmentthisassumptionfails.Finally,themethoddependsonthe

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH32

factthatthefacesofanchorpeopleappearinlargesizesandaroundsomespeciﬁcpositions,butagaintheremaybecaseswherethisisnotthecase.

Inthisstudy,weusethegraphbasedmethodtoﬁndtheanchorpeopleinanunsupervisedway.Theideaisbasedonthefactthat,theanchorpeopleareusuallythemostfrequentlyappearingpeopleinbroadcastnewsvideos.Fordiﬀerentdaystheremaybediﬀerentanchorpeoplereporting,butgenerallythereisasingleanchorpersonforeachday.Hence,weapplythedensestcomponentbasedmethodtoeachnewsvideoseparately,toﬁndthepeopleappearingmostfrequently,whichcorrespondtotheanchorpeople.

3.6DynamicFaceRecognition

Theoverallschemeexplainedintheprevioussectionsreturnsasetofimagesclas-siﬁedasthequeryperson(densestcomponent)andtherestasothers(outliers).Also,thegraphalgorithmworksonthewholesetofimagesinthesearchspaceofthequeryperson.Thus,whenanewfaceisencountered,thealgorithmneedstobere-runonthewholesettolearnthelabelofthenewface–querypersonoroutlier.However,sincetheschemereturnstheclassiﬁedimages,thisresultcanbeusedasamodeltorecognizenewfacesdynamicallyandcheckifitbelongtothefacesofthequeriedperson.Moreover,thistaskcanbeachievedwithoutanysupervision,sincetheschemeprovidesusthetrainingdatalabeledautomatically.Inthenexttwosubsections,weexplainhowtheoutputofthepersonﬁndingapproachcanbeusedinrecognizingnewfaces.Wemodelthereturnedsolutionintwowaystolearnthethresholdsfor:averagedegreeandaveragedistance.

3.6.1DegreeModeling

Asexplainedin3.4,thegreedydensestcomponentalgorithmworksiterativelybyremovingonenodefromthegraphuntilthereisonelastnodeleftinthegraph.Averagedensityofeachsubgraphiscomputedineachiterationand

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH33

ﬁnallythesubgraphwiththelargestaveragedensityisassignedasthedensestcomponent.Amongtheseiterations,thereisonelastnoderemovedfromthecurrentsubgraphthatresultsinthedensestcomponentinthenextiteration.Thislastnodecanbethoughtofasthebreakingpoint,andindicateanevidenceforthemaximumnumberoftotalconnections(edges)fromanoutliernodetoallthenodesinthedensestcomponent.Thistotalnumber–degreeofthenearestoutliertoallthefacesrecognizedasthequeryperson–canusedasathresholdinfurtherrecognition.Whenanewfaceisencountered,itsdegreetoallthefacesinthedensestcomponentiscomputedﬁrst.Then,thefaceislabeledasthequerypersonifitsdegreeisgreaterthanthefounddegreethreshold.

3.6.2DistanceModeling

Inthismethod,averagedistanceoftrue-trueandfalse-truematchesareused.Foreachnodeinthegraph,itsaveragedistancetoallthenodesinthedensestcomponent–hencethefacesofthequeryperson–arecomputed.Ifanodewasinthedensestcomponent,thenitsaveragedistanceislabeledasatrue-truematchdistance,elseafalse-truematchdistance.ThesedistancesarethentrainedwiththequadraticBayesnormalclassiﬁer([43,51])tolearntheaveragedistancethresholdandclassifynewtestimagesbasedonitsaveragedistancetotrueimagesinthetrainingset.

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH34

Figure3.6:Exampleofconvertingaweightedgraphtoabinarygraph.Nodesandtheirdistancesaregivenintheﬁrstimage.Theresultinggraphafterapplying0.65astheproximitythresholdisgiveninthesecondimage.Boldedgesaretheedgesthatremainafterconversion.Theﬁnaldensestcomponentofthisgraphiscircledinthelastimage.

CHAPTER3.GRAPHBASEDPERSONFINDINGAPPROACH35

Figure3.7:Anchorpersonprobleminnewsvideos.Afterintegratingnamesandfacesforaqueryperson,itishighlyprobablethateitheroneoftheanchorpersonswillbereturnedasthedensestcomponentofthelimitedsearchspace.Theﬁrstﬁgureontheleftcorrespondstoasample

Chapter4Experiments

4.1

DataSets

Themethodproposedinthisthesiswastestedontwodiﬀerentdatasets:newsphotographsonthewebandbroadcastnewsvideosontelevision.Inthenexttwosubsections,webrieﬂydescribebothdatasetsusedinourexperiments.

4.1.1NewsPhotographs

ThedatasetconstructedbyBergetal.originallyconsistsofabouthalfamillioncaptionednewsimagescollectedfromYahoo!NewsontheWeb.Afterapplyingafacedetectionalgorithmandprocessingtheresultingfaces,theywereleftwithatotalof30,281detectedfaces[8].Eachimageinthissetisassociatedwithasetofnames.Atotalof13,292diﬀerentnamesareusedforassociation.Howevermorethanhalf(9,609)ofthemareusedonlyonceortwice.Also,aswementionedpreviously,aparticularpersonmaybecalledbydiﬀerentnames.Forexample,thenamesusedforGeorgeWBushandtheirfrequencyare:GeorgeW(1485);W.Bush(1462);GeorgeW.Bush(1454);PresidentGeorgeW(1443);PresidentBush(905);U.S.President(722);PresidentGeorgeBush(44);PresidentBushs(2);PresidentGeorgeWBush(2);GeorgeWBush(2).

Wemergethesetof

CHAPTER4.EXPERIMENTS37

diﬀerentnamesusedforthesamepersonandthentaketheintersectiontoﬁndfacesassociatedwithdiﬀerentnamesofthesameperson.Adetailedlistofdiﬀer-entformsofnamescorrespondingtoeachquerypersonusedinourexperimentsisgivenintheappendix.

Generally,thenumberoffacesintheresultingsetislessthanthenumberofallnamessinceacaptionmayincludemorethanoneinstanceofthereferredname.Forexample,forBushthenumberoffacesis2,849whilethetotalnumberofallreferrednamesis7,528.Intheexperiments,thetop23peopleappearingwiththehighestfrequencies(morethan200times)areused.Figure4.1showsthetotalnumberoffacesassociatedwiththegivennameandthenumberofcorrectfacesforthe23peopleusedintheexperiments.

4.1.2NewsVideos

Theseconddatasetusedintheexperimentsisthebroadcastnewsvideospro-videdbyNISTforTRECVIDvideoretrievalevaluationcompetition2004[1].Itconsistsof229movies(30minuteseach)fromABCandCNNnews.Theshotboundariesandthekey-framesareprovidedbyNIST.SpeechtranscriptsextractedbyLIMSI[21]areusedtoobtaintheassociatedtextforeachshot.Fortheexperiments,wechoose5people,namelyBillClinton,BenjaminNe-tanyahu,SamDonaldson,SaddamHusseinandBorisYeltsin.Inthespeechtranscripttext,theirnamesappear991,51,100,149and78timesrespectively.ThefacedetectionalgorithmprovidedbyMikolajcyzk[40]isusedtoextractfacesfromkey-frames.Duetohighnoiselevelsandlowimageresolutionquality,thefacedetectorproducesmanyfalsealarms.Onrandomlyselectedtenvideos,in2942images,1395regionsaredetectedasfacesbutonly790ofthemarerealfacesand580facesaremissed.Intotal,31,724facesaredetectedoverthewholedataset.

CHAPTER4.EXPERIMENTS38

4.2EvaluationCriteria

Forevaluation,wegivetheexperimentalresultsbasedonrecallandprecisionvalues.Thesevaluesarecomputedasfollows:LetNbethetotalnumberoffacesreturnedbythealgorithmasthefacesofthequeryperson.AmongthoseN,letnbethetotalnumberoffacesthatreallybelongtothatperson.Then,theprecisionvalueofthisresultis:

precison=

n.N

Ifthereisatotalofmfacesinthewholedatasetthatbelongtothequeryperson,thentherecallvalueoftheresultis:

recall=

n.m

Weshouldalsodenotethatasthebaseline,weusetheimagesreturnedbythefacedetector.Hence,m(groundtruthofthequeryperson)iscomputedamongthosedetectedfaces.

Afterﬁndingtherecallandprecisionvaluesforeachquerypersonindividually,weﬁnallycomputetheweightedresultsforaveragerecallandprecisionvalues.Whatwemeanwithweightedisthatrecalland/orprecisionvalueofeachpersonisweightedbythenumberofimagesthatappearinhis/herlimitedsearchspace.Letrecall(i)betherecallvaluefortheithperson,andS(i)bethetotalnumberofimagesinitslimitedsearchspace.Then,weightedaveragerecallis:

󰀂

S(i)∗recall(i)󰀂wavgrecall=.S(i)

Inthefollowingtwosubsections,wegivetheresultsoftheseexperimentsonbothsetsseparately:

CHAPTER4.EXPERIMENTS39

4.3

4.3.1

ExperimentalResultsonNewsPhotographs

MatchingPoints

Astheﬁrststep,thepointshavingtheminimumdistanceaccordingtotheirSIFTdescriptorsaredeﬁnedasthematchingpoints.Thesepointsarefurthereliminatedusingthetwoconstraints.Afterthiseliminationprocess,73%ofallpossibletruematchesarekeptandweloseonly27%oftruematches.Amongtheseassignments,weachievedacorrectmatchingrateof72%.

4.3.2GraphApproach

Thesuccessofouralgorithmvarieswiththethresholdthatischosenwhilecon-vertingtheweighteddissimilaritygraphtoabinaryone.Forthenewspho-tographsdataset,averagerecallandprecisionvaluesbyvaryingthethresholdbetween0.55and0.65isgiveninFigure4.2.Basedonthesevalues,thethreshold0.575ischosentorepresenttherecallandprecisionvaluesforeachperson.Thethreshold0.575ischosentorepresenttherecallandprecisionvaluesforeachpersonThesevaluesaregiveninFigure4.3forthisthreshold.Averageprecisionvalueisobtainedas48%forthebaselinemethodwhichassumesthatallthefacesappearingaroundthenameiscorrect.Withtheproposed,methodweachieved68%recalland71%precisionvaluesontheaverage.Themethodcanachieveupto84%recall-asforGrayDavis-and100%precision-asforJohnAshcroft,HugoChavez,JiangZeminandAbdullahGul.Wehadinitiallyassumedthat,afterassociatingnames,truefacesofthequeriedpersonappearmorethananyotherpersoninthesearchspace.However,whenthisisnotthecase,thealgorithmgivesbadretrievalresults.Forexample,thereisatotalof913imagesassociatedwithnameSaddamHussein,butonly74ofthemaretrueSaddamHusseinimageswhile179ofthemareGeorgeBushimages.

CHAPTER4.EXPERIMENTS40

Toshowthatoursystemworksalsoonindividualsappearinginasmallnum-berofcaptions,weperformedexperimentson10peopleappearinglessthan35timesandobtainedaveragerecallandprecisionvalues85%and66%.Asanotherexperiment,wechangedthenumberofinstancesofafacebyremovingsomeofthecorrectfacesorbyaddingsomeincorrectfaces.For4peoplehavingaround200instancesandsimilarnumberoftrueandfalseimages(i)weremoved50oftrueimagesoffromeachoftheirsearchspace,(ii)weadded100falseimages.Originally,averagerecallandprecisionvalueswere63%and95%.Weobtained59%recallan89%precisionafter(i),and58%recalland70%precisionafter(ii).Althoughtheprecisionissomewhataﬀected,resultsarestillacceptable.SomesampleimagesretrievedandnotretrievedforthreepeoplefromthetestsetareshowninFigure4.4.

4.3.3OnlineRecognition

Therecognitionmethodsexplainedin3.6aretestedonthenewsphotographsdataset.Intheseexperiments,Kpercentageoftheimagesinthesearchspaceofaquerypersonisselectedastheheld-outsetandthegraphalgorithmisappliedontheremainingimagestogetthemodel.AverageresultsofthedegreemodelingmethodaregiveninFigure4.5andinTable4.1.AndtheaverageresultsofthedistancemodelingmethodaregiveninFigure4.6andinTable4.2.ForK=10,therecallandprecisionvaluesforeach23personisgiveninFigures4.7and4.8respectively.fordistancemodelingtechnique.

4.4

4.4.1

ExperimentalResultsonNewsVideos

IntegratingFacesandNames

Forbetterunderstandingofthedistributions,weplotthefrequencyoffacesrelativetothepositionofthenamesfortheﬁvepeoplethatwehavechosenfor

CHAPTER4.EXPERIMENTS41

Table4.1:Recognitionratesofdegreemodelingcentoftheimagesareusedastheheld-outset.

K1020304050modelrecall0.680.680.670.660.66precision0.710.710.710.700.70testsetrecall0.690.700.700.700.69precision0.710.710.700.700.70fordiﬀerentKvalues.(Kper6070800.610.680.700.66900.580.640.760.620.650.630.700.690.690.690.690.68Table4.2:RecognitionratesofdistancemodelingfordiﬀerentKvalues.(Kpercentoftheimagesareusedastheheld-outset.

K102030405060708090modelrecall0.690.680.670.670.660.650.630.610.58precision0.710.710.700.700.700.700.690.680.65testsetrecall0.720.700.700.680.670.670.640.620.57precision0.720.720.720.720.720.720.710.710.69ourexperimentsinFigure4.10.Itisseenthattakingonlyoneprecedingandtwofollowingshots(whichwerefertoas[-1,2])isalsoagoodchoice.Table4.3showsthat,mostofthecorrectfacesfallintothisselectedrangebyremovingmanyfalsealarms.

Table4.3:Numberoffacescorrespondingtothequerynameovertotalnumberoffacesintherange[-10,10]and[-1,2].

RangeClintonNetanyahuDonaldsonSaddamYeltsin[-10,10]213/69059/383137/119718/100421/488[−1,2]160/24576/114102/33014/33219/157

CHAPTER4.EXPERIMENTS42

4.4.2AnchorDetection

Weappliedthedensestcomponentbasedmethodtoeachnewsvideoseparately,toﬁndthepeopleappearingmostfrequently,whichcorrespondtotheanchorpeople.Werunthealgorithmon229videosinourtestset,andobtainedaveragerecallandprecisionvaluesas0.90and0.85respectively.ImagesthataredetectedasanchorpersonintendiﬀerentvideosaregiveninFigure4.11.

4.4.3GraphApproach

Inordertodetermineareasonablethresholdusedinconvertingtheweightedsim-ilaritygraphtoabinaryoneforthenewsvideos,werandomlyselected10videosandrecordedrecall-precisionvaluesofdiﬀerentthresholdsforanchorpersonde-tection.ThesevaluesareplottedinFigure4.12.Furtherinourexperiments,weselectthepointmarkedwithacrossintherecall-precisioncurve,whichcorre-spondstothresholdof0.6.Thesamethresholdisusedbothforanchorpersondetectionandforpersonqueries.

Afterselectingtherangewherethefacesmayappearweapplythedensestcomponentalgorithmtoﬁndthefacescorrespondingtothequeryname.WehaverecordedthenumberoftruefacesofthequerynameandtotalnumberofimagesretrievedasinTable4.4.Theﬁrstcolumnofthetablereferstototalnumberoftrueimagesretrievedvs.totalnumberoftrueimagesretrievedbyusingonlythespeechtranscripts-selectingtheshotswithininterval[-1,2].Thenumbersafterremovingthedetectedanchorpeoplebythealgorithmfromthetext-onlyresultsaregiveninthesecondcolumn.Andthelastcolumnisforapplyingthealgorithmtothisset,fromwhichtheanchorpeopleareremoved.TheprecisionvaluesaregiveninFigure4.14.SomesampleimagesretrievedforeachpersonareshowninFigure4.13.

Ascanbeseenfromtheresults,wekeepmostofthecorrectfaces(especiallyafteranchorpersonremoval),andwegetrejectmanyoftheincorrectfaces.Hencethenumberofimagespresentedtotheuserisdecreased.Also,ourimprovementin

CHAPTER4.EXPERIMENTS43

Table4.4:Numbersinthetableindicatethenumberofcorrectimagesretrieved/totalnumberofimagesretrievedforthequeryname.QuerynameText-onlyAnchorremovedMethodapplied

Clinton160/2457150/1765109/1047

NetanyahuSamDonaldson6/114102/3305/7481/2004/3267/67

Saddam14/33214/2279/110

Yeltsin19/15717/12210/57

precisionvaluesarerelativelyhigh.Averageprecisionofonlytextbasedresultsincreasesby29%afterancherpersonremoval,andby152%afterapplyingtheproposedalgorithm.

4.5AMethodforFindingtheGraphThresholdAutomaticaly

Thenormalizedcutmetricpresentedin[45]canbeusedforselectingthethresholdtoconverttheweightedgraphtoabinaryone.LetAbethesetofverticesofacutandVbethesetofallverticesingraphG.ThenthevalueofacutandnormalizedcutofAaredeﬁnedasfollows:

󰀁

u󰀍A,v󰀍V−A

cut(A,V)=w(u,v),

Ncut(A,V)=

cut(A,V)cut(A,V)+,

assoc(A,V)assoc(V−A,V)

wherew(u,v)isafunctionofsimilaritybetweennodesuandv;and

󰀂

assoc(A,V)=u󰀍A,t󰀍Vw(u,t)isthetotalconnectionfromnodesinAtoallnodesinthegraph.

Weobtaindiﬀerentbinarygraphsbychoosingdiﬀerentthresholdintherange0.55and0.65.Thengraphalgorihmisrunonthesegraphsseperatelyandthe

CHAPTER4.EXPERIMENTS44

normalizedcutvalueforeachgraphiscalculatedasdeﬁnedasabove.Amongall,thethresholdsappliedinconstructingthegraphwithminimumNcutvalueisselected.Theoverallweightedrecallandprecisionvaluesareachievedas74.01and68.55respectively.

4.6PerformanceAnalysis

Theperformanceofoursystemismainlybasedoncomputingthesimilarityvaluessincewecompareeachfacewithallotherfacesinthesearchspace.Hence,thetimecomplexityofconstructingthesimilaritymatrixisO(N∗N),whereNisthetotalnumberofimagesinthelimitedsearchspaceofaqueryname.ThetimecomplexityofthegreedygraphalgorithmisO(N);andittakesconstanttimetocheckifanewimagebelongstothequerypersonaftertheresultofthegraphapproachismodeled.

Toformanexample,thesimilaritymatrixofasearchspacewith200picturesisconstructedin9minutesonaPentiumIV3GHzmachinewith2GBmemory;andittakeslessthan1secondtopartitionthisgraphwiththedensestcomponentalgorithm.

CHAPTER4.EXPERIMENTS45

Figure4.1:Namesof23peopleareusedintheexperiments.Thetotalnumberoffacesassociatedwithanameisrepresentedbyredbarsandnumberofcorrectfacesbygreenbars.

CHAPTER4.EXPERIMENTS46

Figure4.2:Recall-precisioncurveof23peopleinthenewsphotographsdataset.Precisionandrecallvalueschangedependingonthethreshold.Weusedthresholdvaluesbetween0.55and0.65toshowtheeﬀect.Thethresholdusedintherestofourexperimentsismarkedwithred.

CHAPTER4.EXPERIMENTS47

Figure4.3:Recallandprecisionvaluesfor23peopleforgraphthresholdvalueof0.575.Bluebarsrepresentrecallandredbarsrepresentprecisionvaluesthatareachievedwiththeproposedmethod.Greenbarsareprecisionvaluesforthebaselinemethod,whichdoesnotusethevisualinformationandretrievesthefaceswhennameappearsinthecaption.

CHAPTER4.EXPERIMENTS48

Figure4.4:Sampleimagesretrieved(ontheleft)andsampleimagesnotretrieved(ontheright)forthequerynames:GeorgeBush(top),ColinPowell(middle),HansBlix(bottom).

Figure4.5:Recallandprecisonvaluesoftheheld-outset(ontheleft)andtheconstructedmodel(ontheright)foronlinerecognitionwithdegreemodelingmethod.

CHAPTER4.EXPERIMENTS49

Figure4.6:Recallandprecisonvaluesoftheheld-outset(ontheleft)andtheconstructedmodel(ontheright)foronlinerecognitionwithdistancemodelingmethod.

Figure4.7:Recallvaluesoftheheld-outsetandtheconstructedmodelforeachpersonforK=10.

CHAPTER4.EXPERIMENTS50

Figure4.8:Precisonvaluesoftheheld-outsetandtheconstructedmodelforeachpersonforK=10.

Figure4.9:TheﬁgureshowsfrequencyofBillClinton’svisualappearancewithrespecttothedistancetotheshotinwhichhisnameismentioned.Left:whenthewholedatasetisconsidered,right:whenthefacesappearingaroundthenamewithintheprecedingandthefollowingtenshotsareconsidered.OverthewholedatasetClintonhas240facesand213ofthemappearintheselectedrange.

CHAPTER4.EXPERIMENTS51

Figure4.10:TherelativepositionofthefacestothenameforBenjaminNe-tanyahu,SamDonaldson,SaddamHussein,andBorisYeltsinrespectively.

Figure4.11:Detectedanchorsfor6diﬀerentvideos.

Figure4.12:Recall-precisionvaluesforrandomlyselected10newsvideosforthresholdvaluesvaryingbetween0.55and0.65.

CHAPTER4.EXPERIMENTS52

Figure4.13:Sampleimagesretrievedforﬁvepersonqueriesinexperiments.EachrowcorrespondstosamplesforClinton,Netanyahu,SamDonaldson,Saddam,Yeltsinqueriesrespectively.

Figure4.14:Precisionsvaluesachievedforﬁvepeopleusedinourtests.

Chapter5Comparison

Therearetwomainissuestobeanalyzedintheoverallpersonﬁndingapproach:deﬁnitionofthesimilaritymeasureingraphconstructionandthedensestcom-ponentalgorithminpartitioningthisgraph.Forcomparisonreasons,weﬁrstanalyzethesetwoissuesinthefollowingtwosections,andthencomparetheresultswiththepreviousapproachonthesamedataset.

5.1BaselineMethod

Theprinciplecomponentanalysis(pca)isawell-knownmethodthathasalsobeenusedinfacerecognitionaseigenfaces[36].Astocomparewithabaselinemethod,wehaveapplieditonthenewsphotographsdata,togiveanideaoftheperformanceofthetraditionalfacerecognitionmethods.Theexperimentsareconductedonthegroundtruthfacesofthetop23peopleusedinexperiments.Foreachperson,(100-K)percentoftheimagesareselectedfortraining.RemainingKpercentoftheimagesarethenclassiﬁedasbeingoneofthesepeopleinthetrainset.ThealgorithmisrunKtimeswithdiﬀerentrandomgroupsofimagesfortestingandtraining.Recognitionrateiscalculatedbydivingthetrulylabeledimagesbytotalnumberofimagestested.TheaveragerecognitionratesofbothtestandtrainsetfordiﬀerentKaregiveninTable5.1andin5.1,whichshows

CHAPTER5.COMPARISON54

oﬀlowratesfortestimages.

Table5.1:RecognitionratesoftheeigenfacemethodforfordiﬀerentKvalues.

K102030405060708090test51.9051.8350.4148.9146.3343.4839.6434.6528.13train98.5698.5798.5898.7498.8098.8999.1299.3099.45Figure5.1:RecognitionratesoftheeigenfacemethodfordiﬀerentKvalues.

5.2FeatureSelectionandSimilarityMatrixConstruction

5.2.1FindingTrueMatchingPoints

Asstatedearlier,wehavenotusedtheoriginalmatchingmetricofSIFT,sinceitdoesnotworkwellforfaces.(SeeFigure5.2forsamplematchesfoundbyusingtheoriginalmatchingmetricandtheproposedmethodareshown.)Toseehowwelltheoriginalmetricperforms,wehaveconstructedthesimilaritygraphwithusingthematchesoftheoriginalmetricandthenappliedthedensestcomponentalgorithm.AsamplesimilaritymatrixcanbeseeninFigure5.3.TheperformanceresultsfordiﬀerentgraphthresholdsareplottedinFigure5.4.Therecallandprecisionvaluesforthesamethresholdinouroriginaltests(0.575)arerecordedas0.91and0.59respectively.

Inasecondexperiment,wehaveappliedtheRansacalgorithm[20]toﬁndthe

CHAPTER5.COMPARISON55

homographymatrixandassigntruematchesamongall.ResultsofthistechniquesaregiveninFigure5.5.Asitcanbeperceivedfromtheresults,aﬃneconstraintsdoesnotworkwellduetodeformabilitypropertyofthefaces.

5.3ExtractingSimilarGroupofFaces

Wecomparethegreedygraphalgorithmforﬁndingthedensestcomponentwithonewell-knownapproach:k-nearestneighborandtheone-classclassiﬁcationmethodsinthefollowingtwosubsections.Alltheexperimentsareconductedonnewsphotographsdataset,wherewehadachieved68%recalland71%preci-sionvaluesontheaverage.

5.3.1k-nnApproach

Inthelastexperimentsamethodsimilartothek-nearestneighborsclassiﬁcationhasbeenused.Foreachfaceinthetestset,weﬁndthedistancesofthatfacetoallthefacesinthetrainingset,andselectthenearestkfaces(k-neighbors).Ifnumberoftruefacesaregreaterthanthenumberoffalseonesinthisk-neighbors,thenthetestfaceisclassiﬁedasatrueface.Thetestsareconductedwithdiﬀerentnumberoftrainingandtestingsets.TheresultsaregiveninTable5.2andinFigure5.6fordiﬀerentK,whereKindicatespercentageoftheimagesareusedfortesting.Theresultsshowthatthegreedydensestcomponentalgorithmoutperformsthek-nnapproach.

5.3.2One-classClassiﬁcation

Givenasetofdataitems,one-classclassiﬁcationmethodsaimtoﬁndatargetclassagainsttheoutliers(??).Inotherwords,givenatestsample,itiseitheracceptedasbelongingtothetargetclassorrejected.Hence,one-classclassiﬁca-tionapproachdiﬀersfromanyothertraditionalmultiortwo-classclassiﬁcation

CHAPTER5.COMPARISON56

Table5.2:RecognitionratesofsupervisedmethodfordiﬀerentKvalues.(Kisthepercentageoftheimagesthatareusedintesting;kisthenumberofneighboursused.

K102030405060708090k(neighbours)=11recall0.530.520.510.490.470.450.410.360.33precision0.360.360.350.340.340.330.320.310.31k(neighbours)=5recall0.610.610.600.580.560.540.510.470.40precision0.380.380.380.380.370.370.360.350.34k(neighbours)=3recall0.680.670.650.640.620.590.560.520.45precision0.410.400.400.400.390.390.380.370.36approachesbyholdingonlytheinformationofthetargetclass.Withthegreedydensestcomponentalgorithm[15],wealsoseektoﬁndonlythenodesbelongingtothedensestcomponent(hencethefacesofthequeryperson)andassumeallothersareoutliers.Inthiscontext,themostsimilarproblemtotheproblemofﬁndingthedensestcomponentofthegraphisone-classclassiﬁcationproblem.Tothisend,wecomparetheone-classclassiﬁcationmethodsin??withthegraphalgorithmusedinthisstudy.

Thesimilaritygraphconstructedasdescribedin??keepsonlythedistancesamongfaces.Hence,theone-classclassiﬁcationmethodscannotbeappliedtothisgraph.Sotocomparethegreedygraphdensestcomponentmethodwithanyofthosemethods,weusedtheBag-Of-Featuresapproachasin[46,33]forgraphconstruction.Then,weappliedboththegreedygraphmethodandtwooftheone-classclassiﬁcationmethods(nearestneighbordatadescriptionmethodandk-nearestneighbordatadescriptionmethod)thatgaveusthebestresultsamongall.

Toconstructthegraph,weﬁrstextractedsiftfeaturesfromeachfaceimageandclusteredthesefeaturesusingk-meansclusteringinto50clusters.Then,ahistogramofthesizeofnumberofclusters(50)isformedforeachimageshowing

CHAPTER5.COMPARISON57

thedistributionofclusters.InInformationRetrieval,thefrequenciesoftheclus-tersareweightedby’termfrequencyinversedocumentfrequency(tf-idf)’whichiscomputedas:

nidNlogndni

tfidf=(5.1)

where,nidisnumberofoccurrencesoftermiindocumentd,ndistotalnum-beroftermsindocumentd,Nisthetotalnumberofdocumentsindatabaseandniisthenumberofdocumentsindatabasecontainingtermi.

Adaptingthesameapproach,weﬁndtheweightedfrequenciesoftheclustersandusethemastheﬁnalfeaturevectorforeachimage.Theone-classclassi-ﬁcationmethodscanthenbeappliedonthesefeatures.Toapplythedensestcomponentgraphalgorithm,thesimilarityamongthefacesarefoundbynormal-izedscalarproductoftﬁdfvectorsbyusingthefollowingequation:

tfidf(f1)∗tfidf(f2)norm(f1)∗norm(f2)

distance(f1,f2)=(5.2)

where,tfidf(f1)istf-idfvectoroffaceimagef1andnorm(f1)isthenormoftﬁdfvectoroffaceimagef1.ThesimilaritymatrixconstructedasdescribedaboveforthequerynameHansBlixisshowninFigure5.7.Theprecision-recallcurveofthedensestcomponentgraphalgorithmforvaryinggraphthresholdsisgiveninFigure5.8.Recallandprecisionvaluesofone-class-classiﬁcationmethodsfordiﬀerentK-foldvalidationtestsaregiveninTable5.3.Theresultsindicatethattheonce-classclassiﬁcationmethodsisnotsuperiortothegreedydensestcomponentalgorithmusedinourexperimentsforﬁndingthemostsimilarsetoffaces.

CHAPTER5.COMPARISON58

Table5.3:Recall-precisionratesoftwooneclassclassiﬁcationmethods:w1(near-estneighbordatadescriptionmethod)andw2(k-nearestneighbordatadescrip-tionmethod)(appliedontﬁdf’s).

trainingsetK=10K=20K=30K=40K=50recprerecprerecprerecprerecprew11.000.541.000.531.000.521.000.521.000.51w20.900.570.900.550.900.540.900.540.900.53testsetK=10K=20K=30K=40K=50recprerecprerecprerecprerecprew1.900.500.910.500.900.500.900.500.900.49w20.840.530.880.540.870.530.860.530.860.52CHAPTER5.COMPARISON59

Figure5.2:Examplesformatchingpoints.Firstcolumnformatchesfoundbyusingtheoriginalmatchingmetricofsift.Secondcolumnformatchesfoundbyapplyingtheproposedmethod.

CHAPTER5.COMPARISON60

Figure5.3:Similaritymatrixfor201imagesinthesearchspaceforthenameHansBlix.Inthissearchspace,98oftheimagesaretrueHansBliximages,andtheremaining103arenot.Forvisualization,the98trueBliximagesareputonthetopleftofthematrix.Darkcolorscorrespondtolargersimilarityvalues.

Figure5.4:Recall-precisioncurveof23peopleinthetestset.Precisionandrecallvalueschangedependingonthethreshold.

CHAPTER5.COMPARISON61

Figure5.5:Examplesformatchingpointsassignedbythehomographymatrixfoundafterransac.

Figure5.6:Recognitionratesofthek-nnapproachfordiﬀerentKandkvalues.Recallvaluesontheleftandprecisionontheright.Kisthepercentageoftheimagesusedfortesting,andkisthenumberofneighborsink-nn.

CHAPTER5.COMPARISON62

Figure5.7:Similaritymatrixfor201imagesinthesearchspaceforthenameHansBlix.Inthissearchspace,98oftheimagesaretrueHansBliximages,andtheremaining103arenot.Forvisualization,the98trueBliximagesareputonthetopleftofthematrix.Darkcolorscorrespondtolargersimilarityvalues.

Figure5.8:Recall-precisioncurveof23peopleinthetestset.Precisionandrecallvalueschangedependingonthethreshold.

Chapter6

ConclusionsandFutureWork

6.1

Conclusions

Inthispaper,weproposeagraphbasedmethodforqueryingpeopleinlargenewsphotographandvideocollectionswithassociatedcaptionsorspeechtranscripttexts.Givensimilaritymeasuresbetweenthefaceimagesinadataset,theproblemistransformedintoagraphprobleminwhichweseekthelargestdensestcomponentofthegraphcorrespondingtothelargestgroupofsimilarfaces.WeuseSIFTdescriptors[35]torepresenteachfaceimageanddeﬁnethesimilarityvaluesbyusingtheaveragedistancesofthematchinginterestpoints.Then,weapplyagreedygraphalgorithm[15]toﬁndthedensestcomponentofthegraphcorrespondingtothefacesofthequeryperson.Inthepaper,wealsoproposetwodiﬀerentmethodstousetheresultsofthegraphbasedpersonﬁndingapproachinfutherrecognitionofnewfaces.

Forlargerealisticdatasets,facerecognitionandretrievalisstilladiﬃcultandanerror-proneproblemduetolargevariationsinpose,illuminationandexpressions.Inthisstudy,wehavedescribedamulti-modalapproachforqueryinglargenumbersofpeopleinsuchdatasets.Themethoddoesnotrequiretrainingforanyspeciﬁcpersonandthusitcanbeappliedtoanynumberofpeople.Withthisproperty,itissuperiortoanysupervisedmethodwhichrequireslabelingof

CHAPTER6.CONCLUSIONSANDFUTUREWORK64

largenumberofsamples.Theresultsachievedarealsoveryclosetosupervisedmethods.

Theexperimentsareconductedontwodiﬀerentnewsdatasets.TheﬁrstsetconsistsofthousandsofnewsphotographswithassociatedcaptionscollectedfromYahoo!news.Thecaptionsareusedtolimitthenumberofimagesforaquerynameandonlytheimagesassociatedwiththenameareselected.Inthisdata,over20%increaseinprecisionisachievedcomparedtosolelytext-basedmethods.Forindividuals,upto84%recalland100%precisionvaluescanbeobtained.

Thesecondexperimentsareconductedon229broadcastnewsvideosarchive.Weﬁstusethespeechtranscriptsandselecttheneighboringshotsinwhichthenameofthequerynameappearstolimitoursearchspace.Applyingtheproposedpersonﬁndingalgorithmoneachvideoseparately,wedetecttheanchorpersonineachvideo.Then,weremovedetectedanchorpersonfromthesearchspaceofthequerynameandapplythealgorithmtotheremainingimages.Experimentsshowthatweimprovepersonsearchperformancesrelativetoonlytextbasedresults.Averageprecisionvaluesofonlytextbasedresultsareincreasedby29%afteranchorpersonremoval,andby152%afterapplyingtheproposedalgorithm.Thepersonﬁndingalgorithmalsoperformswellforanchorpersondetectionwithoutrequiringanysupervision.

Theproposedmethodisanoverallschemetoﬁndandrecognizefacesinlargenewsphotographandvideocollections.Eachstepofthemethodcanbecommitedwithanothertecniques,forinstancesimilaritiesbetweenfacescanbeassignedwithadiﬀerentapproachorthedensestcomponentcanbeextractedwithanothergraphpartitioningalgorithm.However,experimentsshowthatoursimilaritydeﬁnitionworksbetterthanothertraditionalapproachesforthisdataset;andthegreedygraphalgorithmiscomparabletooneotherthemostpossibleapproach,namelyone-classclassiﬁcation.

Oneoftheimportantremarkstobemadeonthemethodisthatevenifitisnotafacerecognitionschemeonthewhole,itisinstrumentalinreducingthenumberofimagespresentedtotheuserbyimprovingtheretrievalperformance

CHAPTER6.CONCLUSIONSANDFUTUREWORK65

ofbaselinemethods.

6.2FutureWork

Beforeapplyingthegreedydensestcomponentalgorithm,weconvertourweighedgraphconsistingofdissimilarityvaluesintoabinarygraph.However,thisignoressomeoftheinformation.Amethod,whichdoesnotviolatetheweightedpropertyofthegraph,mayyieldbetterresults.Inthisstudy,SIFTdescriptorsareusedtorepresentthesimilarityofthefaces.Otherrepresentationsorsimilaritymeasurescanalsobeusedtoconstructthegraphstructure.

In[28]setsoffaceexemplarsforeachpersonaregatheredautomaticallyinshotsfortrackinginvideo.Asimilarapproachcanbeadaptedandinsteadoftakingasinglefacefromeachshotbyonlyconsideringthekey-frames,facedetectioncanbeappliedtoallframestoobtainmoreinstancesofthesame.Thisapproachcanhelptoﬁndbettermatchinginterestpointsandmoreexamplesthatcanbeusedinthegraphalgorithm.

Sincetheproposedapproachisanoverallscheme,itcanalsobeappliedtootherproblemssuchasobjectrecognitionorimageregionannotation.Forin-stance,inthecontextofregionannotation,annotationsofimagescanbeusedforlimitingthesearchspaceforaregion.Similary,inthislimitedspace,theregionofinterestisexpectedtoformthelargestsimilargroupofregions.

Bibliography

[1]Trecvideoretrievalevaluation

http://www-nlpir.nist.gov/projects/trecvid/,2004.

[2]J.Sivic.EveringhamandA.Zisserman.Hello!mynameis...buﬀy–auto-maticnamingofcharactersintvvideo.InProceedingsoftheBritishMachineVisionConference,2006.

[3]S.AksoyandR.M.Haralick.Graph-theoreticclusteringforimagegrouping

andretrieval.InIEEEInternationalConferenceonComputerVisionandPatternRecognition(CVPR),pages63–69,1999.

[4]K.Barnard,P.Duygulu,N.deFreitas,D.A.Forsyth,D.Blei,andM.Jor-dan.Matchingwordsandpictures.JournalofMachineLearningResearch,3,2003.

[5]M.Bartlett,H.Lades,andT.Sejnowski.Independentcomponentrepresen-tationsforfacerecognition,1998.

[6]S.Belongie,J.Malik,andJ.Puzicha.Shapematchingandobjectrecognition

usingshapecontexts,2001.

[7]S.Beretti,A.DelBimbo,andE.Vicario.Eﬃcientmatchingandindexing

ofgraphmodelsincontent-basedretrieval.IEEETransactionsonPatternAnalysisandMachineIntelligence,23(10):1089–1105,2001.

[8]T.Berg,A.C.Berg,J.Edwards,andD.A.Forsyth.Who’sinthepicture.

InNeuralInformationProcessingSystems(NIPS),2004.

BIBLIOGRAPHY67

[9]T.Berg,A.C.Berg,J.Edwards,M.Maire,R.White,Y.-W.Teh,

E.Learned-Miller,andD.A.Forsyth.Namesandfacesinthenews.InIEEEConf.onComputerVisionandPatternRecognition(CVPR),2004.[10]M.Bicego,A.Lagorio,E.Grosso,andM.Tistarelli.Ontheuseofsift

featuresforfaceauthentication.InCVPRW’06:Proceedingsofthe2006ConferenceonComputerVisionandPatternRecognitionWorkshop,page35,Washington,DC,USA,2006.IEEEComputerSociety.

[11]V.Bruce.Recognizingfaces,1988.LawrenceErlbaumAssociates,London,

U.K.

[12]J.M.FellousC.vonderMalsburg,N.KrugerandL.Wiskott.Facerecogni-tionbyelasticbunchgraphmatching.InComputerAnalysisofImagesandPatterns1997,pages456–463,1997.

[13]C.Carson,S.Belongie,H.Greenspan,andJitendraMalik.Blobworld:Image

segmentationusingexpectation-maximizationanditsapplicationtoimagequerying.IEEETransactionsonPatternAnalysisandMachineIntelligence,24(8),2002.

[14]M.LaCascia,S.Sethi,andS.Sclaroﬀ.Combiningtextualandvisualcues

forcontent-basedimageretrievalontheworldwideweb.InProc.IEEEWorkshoponContent-BasedAccessofImageandVideoLibraries,SantaBarbaraCAUSA,June1998.

[15]M.Charikar.Greedyapproximationalgorithmsforﬁndingdensecomponents

inagraph.InAPPROX’00:Proc.ofthe3rdInternationalWorkshoponApproximationAlgorithmsforCombinatorialOptimization,London,UK,2000.

[16]F.Chen,U.Gargi,L.Niles,andH.Schuetze.Multi-modalbrowsingof

imagesinwebdocuments.InProceedingsofSPIEDocumentRecognitionandRetrievalVI,1999.

[17]M-Y.ChenandA.Hauptmann.Searchingforaspeciﬁcpersoninbroadcast

newsvideo.InInternationalConferenceonAcoustics,Speech,andSignalProcessing(ICASSP’04),Montreal,Canada,May17-212004.

BIBLIOGRAPHY68

[18]S.Dickinson,M.Pelillo,andR.Zabih.Introductiontothespecialsection

ongraphalgorithmsincomputervision.IEEETransactionsonPatternAnalysisandMachineIntelligence,23(10):1049–1052,2001.

[19]P.DuyguluandA.Hauptmann.What’snews,what’snot?associatingnews

videoswithwords.InThe3rdInternationalConferenceonImageandVideoRetrieval(CIVR)Ireland,July21-23,2004.

[20]M.A.FischlerandR.C.Bolles.Randomsampleconsensus:aparadigmfor

modelﬁttingwithapplicationstoimageanalysisandautomatedcartography.Commun.ACM,24(6):381–395,1981.

[21]J.L.Gauvain,L.Lamel,andG.Adda.Thelimsibroadcastnewstranscrip-tionsystem.SpeechCommunication,37(1-2),2002.

[22]Y.Gdalyahu,D.Weinshall,andM.Werman.Self-organizationinvision:

Stochasticclusteringforimagesegmentation,perceptualgrouping,andim-agedatabaseorganization.IEEETransactionsonPatternAnalysisandMa-chineIntelligence,23(10):1053–1074,2001.

[23]R.Gross,S.Baker,I.Matthews,andT.Kanade.Facerecognitionacross

poseandillumination.InStanZ.LiandAnilK.Jain,editors,HandbookofFaceRecognition.SpringerVerlag,2004.

[24]R.Gross,J.Shi,andJ.Cohn.Quovadisfacerecognition?InThirdWork-shoponEmpiricalEvaluationMethodsinComputerVision,2001.

[25]G.HamerlyandC.Elkan.Learningthekink-means.InAdvancesinNeural

InformationProcessingSystems,volume17,2003.

[26]S.HelmerandD.G.Lowe.Objectrecognitionwithmanylocalfeatures.

InWorkshoponGenerativeModelBasedVision2004(GMBV),WashingtonD.C.,2004.

[27]N.IkizlerandP.Duygulu.Personsearchmadeeasy.InTheFourthIn-ternationalConferenceonImageandVideoRetrieval(CIVR),Singapore,2005.

BIBLIOGRAPHY69

[28]M.EveringhamJ.SivicandA.Zisserman.Personspotting:videoshot

retrievalforfacesets.InInternationalConferenceonImageandVideoRetrieval(CIVR),Singapore,2005.

[29]G.M.DaviesJ.W.ShepherdandH.D.Ellis.Studiesofcuesaliency,1981.

InPerceivingandRememberingFaces,G.M.Davies,H.D.Ellis,andJ.W.Shepherd,Eds.AcademicPress,London,U.K.

[30]I.JermynandH.Ishikawa.Globallyoptimalregionsandboundariesas

minimumratioweightcycles.IEEETransactionsonPatternAnalysisandMachineIntelligence,23(10):1075–1088,2001.

[31]Y.KayaandK.Kohayashi.Abasicstudyonhumanfacerecognition,1972.

FrontiersofPatternRecognition,S.Watanabe,ed.,p.265.

[32]MichaelDavidKelly.Visualidentiﬁcationofpeoplebycomputer.PhDthesis,

1971.

[33]S.Lazebnik,C.Schmid,andJ.Ponce.Beyondbagsoffeatures:Spatial

pyramidmatchingforrecognizingnaturalscenecategories.InCVPR,2006.[34]H.LeandH.Li.Recognizingfrontalfaceimagesusinghiddenmarkovmodels

withonetrainingimageperperson.InICPR’04:ProceedingsofthePatternRecognition,17thInternationalConferenceon(ICPR’04)Volume1,pages318–321,Washington,DC,USA,2004.IEEEComputerSociety.

[35]D.G.Lowe.Distinctiveimagefeaturesfromscale-invariantkeypoints.In-ternationalJournalofComputerVision,60(2),2004.

[36]A.PentlandM.Turk.Eigenfacesforrecognition.JournalofCognitiveNeu-rosicence,3(1):71–86,1991.

[37]A.P.PentlandM.A.Turk.Facerecognitionusingeigenfaces.InIEEECon-ferenceonComputerVisionandPatternRecognition,1991.

[38]B.S.Manjunath,R.Chellappa,andC.vonderMalsburg.Afeaturebased

approachtofacerecognition.InIEEEConf.onComputerVisionandPat-ternRecognition(CVPR),1992.

BIBLIOGRAPHY70

[39]K.MikolajczkandC.Schmid.Aperformanceevaluationoflocaldescriptors.

InIEEEConferenceonComputerVisionandPatternRecognition,2003.[40]K.Mikolajczyk.Facedetector.INRIARhone-Alpes,2004.Ph.DReport.[41]K.MikolajczykandC.Schmid.Indexingbasedonscaleinvariantinterest

points.InInternationalConferenceonComputerVision(ICCV),pagesI:525–531,2001.

[42]B.Park.Facerecognitionusingface-argmatching.IEEETrans.Pattern

Anal.Mach.Intell.,27(12):1982–1988,2005.Member-Kyoung-MuLeeandMember-Sang-UkLee.

[43]P.E.HartR.O.DudaandD.G.Stork.InPatternclassiﬁcation.JohnWiley

andSons,2001.

[44]S.SatohandT.Kanade.Name-it:Associationoffaceandnameinvideo.

InProceedingsofIEEEConferenceonComputerVisionandPatternRecog-nition(CVPR),1997.

[45]J.ShiandJ.Malik.Normalizedcutsandimagesegmentation.IEEETrans-actionsonPatternAnalysisandMachineIntelligence,22(8):888–905,2000.[46]J.SivicandA.Zisserman.Videogoogle:Atextretrievalapproachtoobject

matchinginvideos.InICCV’03:ProceedingsoftheNinthIEEEInterna-tionalConferenceonComputerVision,page1470,Washington,DC,USA,2003.IEEEComputerSociety.

[47]C.G.M.SnoekandM.Worring.Multimodalvideoindexing:Areviewofthe

state-of-the-art.MultimediaToolsandApplications,25(1),2005.

[48]X.Tan,S.Chen,Z.Zhou,andF.Zhang.Facerecognitionfromasingle

imageperperson:Asurvey.PatternRecognition.,39(9):1725–1745,2006.[49]R.ChellappaW.Zhao.Image-basedfacerecognition:Issuesandmethods.

ImageRecognitionandClassiﬁcation,pages375–402,2002.

BIBLIOGRAPHY71

[50]J.Wang,K.N.Plataniotis,andA.N.Venetsanopoulos.Selectingdis-criminanteigenfacesforfacerecognition.26(10):1470–1482,2005.

[51]A.Webb.InStatisticalPatternRecognition.JohnWileyandSons,2002.[52]J.Yang,M-Y.Chen,andA.Hauptmann.Findingpersonx:Correlating

nameswithvisualappearances.InInternationalConferenceonImageandVideoRetrieval(CIVR),DublinCityUniversityIreland,2004.

[53]J.Yang,D.Zhang,A.F.Frangi,andJ.Yang.Two-dimensionalpca:Anew

approachtoappearance-basedfacerepresentationandrecognition.IEEETrans.PatternAnal.Mach.Intell.,26(1):131–137,2004.

[54]W.Zhao,R.Chellappa,P.J.Phillips,andA.Rosenfeld.Facerecognition:

Aliteraturesurvey.ACMComputingSurveys,35(4):399–458,2003.

PatternRecognitionLetters,

AppendixA

DiﬀerentFormsofNamesinNewsPhotographs

GeorgeBushGeorgeW(1485),W.Bush(1462),GeorgeW.Bush(1454),PresidentGeorgeW(1443),PresidentBush(905),U.S.President(722),PresidentGeorgeBush(44),PresidentBushs(2),PresidentGeorgeWBush(2),GeorgeWBush(2)

SaddamHussein(911),PresidentSaddam(356),PresidentSaddamHussein(351),PresidentSaddamHussien(3),PresidentSaddamHussiein(1),MinisterSaddam(1)ColinPowell(618),SecretaryofStateColinPowell(606)SecretaryColinPowell(4),CollinPowell(3),

SecretaryGeneralColinPowell(2),SecretaryPowell(1),SecretaryofStatePowell(1)

TonyBlair(502),PrimeMinisterTonyBlair(472),PremierTonyBlair(2),PrimeMinisterBlair(2),MrBlair(1),PrimeMinisterTonyBair(1)

PrimeMinisterJean(158),JeanChretien(155),MinisterJeanChretien(145),PrimeMinisterJeanChretien(145),PrimeMinisterJohnChretien(2),PrimeMinisterChretien(2)GerhardSchroeder(311),ChancellorGerhardSchroeder(283)ChancellorSchroeder(1),ChancellorGerhardSchroeders(2)

ChancellorGerhardSchroder(1),ChancellorGerhardSchoeder(1)

SaddamHusseinColinPowellTonyBlairJeanChretienGerhardSchroeder72

APPENDIXA.DIFFERENTFORMSOFNAMESINNEWSPHOTOGRAPHS73

JohnAshcroftDonaldRumsfeldArielSharonsJunichiroKoizumiHugoChavezGeneralKoﬁRohMoo-hyunLuladaJacquesChiracVladimirPutinAbdullahGulJiangZeminJohnPaulSilvioBerlusconiDavidBeckhamGrayDavisHansBlixJohnAshcroft(147),GeneralJohnAshcroft(146)AttorneyGeneralJohnAshcroft(143),U.S.Attorney(106)U.S.Attorney(2),U.SAttorney(1)DonaldRumsfeld(279),DonaldH(47),SecretaryDonaldRumsfeld(84),DonaldH.Rumsfeld(44),SecretaryDonaldH(26),H.Donald(13),

SecretaryofStateDonaldRumsfeld(4),SecretaryRumsfeld(6)MinisterAriel(249),PrimeMinisterAriel(248)PrimeMinisterArielSharons(2)

JunichiroKoizumi(156),PrimeMinisterJunichiro(151)PrimeMinisterJunichiroKoizumi(149),PrimeMinisterKoizumi(1)HugoChavez(194),PresidentHugoChavez(186),PresidentHugoChaves(1),PresidentChavez(3)GeneralKoﬁ(124),SecretaryGeneralKoﬁ(61)Secretary-GeneralKoﬁ(60),GeneralGeneralKoﬁ(1)

Secretary-GenaralKoﬁ(1),Annan,U.N.(1),U.N.Secretary(57)U.N.Secretary-(39),U.N.General(9)

RohMoo-hyun(86),RohMoo-(93),PresidentRohMoo-hyun(55)President-electRohMoo-hyun(10),PresidentRoh(61),PresidentRohMoo-(61),President-ElectRohMoo(1)Lulada(119),PresidentLuizInacioLula(30),President-electLuizInacioLula(19),PresidentLula(7),PresidentLulaDa(5),President-electLulaDa(2)

PresidentLuizLulaDa(1),PresidentLuisInacioLula(1)

President-electLuisInacioLula(1),LuizInacio(105),LuisInacio(4)JacquesChirac(143),PresidentJacquesChirac(138)PresidentChirac(4),PresidentJaquesChirac(3)VladimirPutin(146),PresidentPutin(4)PresidentVladimirPutin(136)

AbdullahGul(84),MinisterAbdullahGul(57)PrimeMinisterAbdullahGul(47),PremierAbdullahGul(9)MinisterAbdullah(74)

JiangZemin(94),PresidentJiang(85),PresidentJiangZemin(85),GeneralSecretaryJiangZemin(2)JohnPaul(135),JohnPaulII(57),JohnPaulII(42)SilvioBerlusconi(113),PrimeMinisterSilvioBerlusconi(81)PremierSilvioBerlusconi(22),PrimeMinisterSivlioBerlusconi(1)DavidBeck(94),DavidBeckham(93),captainDavidBeckham(17),captainBeckham(5)

GrayDavis(109),Gov.GrayDavis(73),GovernorGrayDavis(26)HansBlix(168),InspectorHansBlix(17),Dr.HansBlix(13),U.N.HansBlix(3)

因篇幅问题不能全部显示，请点此查看更多更全内容

查看全文

全部栏目

ABSTRACT A GRAPH BASED APPROACH FOR FINDING PEOPLE IN NEWS