In [40]:
import PyPDF2

paperFile = open('PixelCNN.pdf', 'rb')
print(type(paperFile))
pdfReader = PyPDF2.PdfFileReader(paperFile)
<class '_io.BufferedReader'>
In [34]:
page8 = pdfReader.getPage(7).extractText()
print(page8)
MinimalGatedUnitforRecurrentNeuralNetworks
numberofparameters,aswasdonein(
Chungetal.
,
2014
).
ThecomparisonresultsareshowninTable
2
.
MGUhasonethirdlessparametersthanGRU.Thus,the
numberofparametersareroughlythesameforMGUwith
500hiddenunitsandGRUwith400hiddenunits.When
wecomparethesetwoalgorithmsusingthesesettings,the
gapbetweenGRUandMGUbecomessmaller(102.33vs.
105.89onthetestset,and107.92vs.110.68onthevalida-
tionset).
Ifwecomparethesetwoalgorithmswiththesameamount
oftrainingtime,MGUisfasterthanGRU.MGUwith500
unitsisroughlyasfastasGRUwith300units;and,MGU
with300unitsissimilartoGRUwith100units.When
thenumbersofhiddenunitsaresame(e.g.,500),thepro-
posedMGUcanrunmoreepochsthanGRUgiventhesame
amountoftrainingtime,whichweexpectwillcontinueto
decreasetheperplexityofMGU.
WecanalsocompareMGUwiththeresultsin(
Jozefow-
iczetal.
,
2015
).Whenthereare400hiddenunits,the
totalnumberofparameters(includingalllayers)ofMGU
is4.8M,whichcanbefairlycomparedwiththefi5M-tstfl
resultin(
Jozefowiczetal.
,
2015
,Table3),soisGRU
with300units.OurGRUimplementation(with300hid-
denunits)hasatestsetperplexityof102.55,lowerthan
theGRU(with5Mparameters)resultin(
Jozefowiczetal.
,
2015
),whichis108.42(
=exp(4
:
684)
).Theproposed
MGU(with400hiddenunits)achievesatestsetperplexity
of106.02,alsolowerthantheGRUresultin(
Jozefowicz
etal.
,
2015
).
TheSCRNmethodhasstillfewerparametersthanthe
proposedMGU.Whenthereare100hiddenunits,MGU
has60,400parameters.AsimilarSCRNarchitecturehas
100hiddenunitsand40contextunits(see
Mikolovetal.
,
2015
,Table1),whichhas48,200parameters,amounting
toroughly80%ofthatofMGU.Onthisdataset,however,
SCRNseemstosaturateattestsetperplexity115,because
SCRNwith100and300hiddenunitsarrivedatthissame
perplexity.MGUgetslowerperplexitythanSCRNonthis
dataset.
4.5.Discussions
WehaveevaluatedtheproposedMGUonfourdifferentse-
quencedata.ThecomparisonismainlyagainstGRU,while
resultsofIRNNandSCRNarealsocitedwhenappropri-
ate.Theinputsequencelengthrangesshort(35,50Œ55),
moderate(128),andlong(784).Thesequencedatarange
fromtoreal-world,andthetaskdomainsarealso
diverse.
TheproposedmethodisonparwithGRUintermsofac-
curacy(orerror,orperplexity).Givenitsminimaldesign
ofonegate,MGUhasonlytwothirdsoftheparameters
ofGRU,andhencetrainsfasterinalldatasets.However,
insomeproblems(e.g.,PennTreeBank),GRUconverges
fasterthanMGU.Overall,throughtheseexperimentalre-
sultswebelieveMGUhasprovenitselfasanattractiveal-
ternativeinbuildingRNN.
5.ConclusionsandFutureWork
Inthispaper,weproposedanewhiddenunitforrecur-
rentneuralnetworks.TheproposedMinimalGatedUnit
(MGU)hastheminimaldesigninanygatedhiddenunit
forRNN.Ithasonlyonegate(theforgetgate)anddoes
notinvolvethepeepholeconnection.Hence,thenumber
ofparametersinMGUisonlyhalfofthatintheLong
Short-TermMemory(LSTM),ortwothirdsofthatinthe
GatedRecurrentUnit(GRU).WecomparedMGUwith
GRUonseveraltasksthatdealwithsequencedatainvar-
iousdomains.MGUhasachievedcomparableaccuracy
withGRU,and(thankstotheminimaldesign)trainsfaster
thanGRU.
Basedonourevaluations,MGUcouldbereadilyusedas
thehiddenunitinanRNN,whichmayreducememory
footprintandtrainingtimeinsomeapplications.Moreim-
portantly,theminimaldesignwillfacilitateourtheoretical
analysisorempiricalobservation(e.g.,throughvisualiza-
tion)ofRNNmodels,andenhanceourunderstandingand
facilitateprogressesinthisdomain.Aminimaldesignalso
meansthattherearefewerpossibilitiesofproducingvari-
ants(whichwillcomplicatetheanalysis).
Amplewaysarepossibletofurtherthislineofresearch.
Beyondanalysisandunderstanding,wewillalsorunMGU
withmoreepochs,inmorediverseandcomplextasks,and
regularizeMGUtoimproveitsaccuracy.
References
Bahdanau,D.,Cho,K.,andBengio,Y.NueralMachine
TranslationbyJointlyLearningtoAlignandTranslate.
In
Int'lConfLearningRepresenations
,2015.
Bengio,Y.,Simard,P.,andFrasconi,P.LearningLong-
termDependencieswithGradientDescentisDif
IEEETrans.NeuralNetworks
,5(2):157Œ166,1994.
Cho,K.,vanMeri
¨
enboer,B.,Gulcehre,C.,Bahdanau,D.,
Bougares,F.,Schwenk,H.,andBengio,Y.Learning
PhraseRepresentationsusingRNNEncoderŒDecoder
forStatisticalMachineTranslation.In
Proc.Empiri-
calMethodsinNaturalLanguageProcessing
,pp.1724Œ
1735,2014.
Chung,J.,Gulcehre,C.,Cho,K.,andBengio,Y.Empir-
icalEvaluationofGatedRecurrentNeuralNetworkson

In [22]:
import re
from collections import OrderedDict

def del_list(w_array, index):
    del w_array[:index]

def getTitle(w_array):
    # Map each line index to the first four-digit match, or None if the line has none.
    array = OrderedDict()
    for i, w in enumerate(w_array):
        array[i] = re.search('[1-2][0-9][0-9][0-9]', w)

    # Keep only the lines that matched. The key is the index just past the line,
    # so it can be used directly as a slice end.
    years = OrderedDict()
    for i, state in array.items():
        if state is not None:
            years[i + 1] = state.group(0)

    # A plausible publication year marks the end of one reference entry.
    titles = []
    previous_i = 0
    for i, year in years.items():
        if 1950 < int(year) <= 2017:
            # Join the entry's lines; if a ']' is present, keep the segment after the first one.
            title = ''.join(w_array[previous_i:i]).split(']')
            titles.append(title[0] if len(title) == 1 else title[1])
            previous_i = i
    return titles
    
numOfPages = pdfReader.getNumPages()
titles = []
in_references = False  # becomes True once the 'References' heading has been seen
for page_no in range(numOfPages):
    page = pdfReader.getPage(page_no).extractText().split('\n')
    if 'References' in page:
        # Drop everything up to and including the heading line.
        del_list(page, page.index('References') + 1)
        in_references = True
    elif not in_references:
        continue

    titles.extend(getTitle(page))

print(titles)
['Abadi,AshishAgarwal,PaulBarham,EugeneBrevdo,ZhifengChen,CraigCitro,GregSCorrado,AndyDavis,JeffreyDean,MatthieuDevin,etal.Tw:Large-scalemachinelearningonheterogeneousdistributedsystems.arXivpreprintarXiv:1603.04467,2016.', 'MarcGBellemare,SriramSrinivasan,GeorgOstrovski,TomSchaul,DavidSaxton,andRemiMunos.Unifyingcount-basedexplorationandintrinsicmotivation.arXivpreprintarXiv:1606.01868,2016.', 'EmilyLDenton,SoumithChintala,RobFergus,etal.Deepgenerativeimagemodelsusingalaplacianpyramidofadversarialnetworks.InAdvancesinNeuralInformationProcessingSystems,pages1486Œ1494,2015.', 'LaurentDinh,DavidKrueger,andYoshuaBengio.NICE:Non-linearindependentcomponentsestimation.arXivpreprintarXiv:1410.8516,2014.', 'LeonAGatys,AlexanderSEcker,andMatthiasBethge.Aneuralalgorithmofartisticstyle.arXivpreprintarXiv:1508.06576,2015.', 'IanGoodfellow,JeanPouget-Abadie,MehdiMirza,BingXu,DavidWarde-Farley,SherjilOzair,AaronCourville,andYoshuaBengio.Generativeadversarialnets.InAdvancesinNeuralInformationProcessingSystems,pages2672Œ2680,2014.[7', 'KarolGregor,FredericBesse,DaniloJRezende,IvoDanihelka,andDaanWierstra.Towardsconceptualcompression.arXivpreprintarXiv:1601.06759,2016.', 'KarolGregor,IvoDanihelka,AlexGraves,andDaanWierstra.DRAW:Arecurrentneuralnetworkforimagegeneration.Proceedingsofthe32ndInternationalConferenceonMachineLearning,2015.', 'KarolGregor,IvoDanihelka,AndriyMnih,CharlesBlundell,andDaanWierstra.Deepautoregressivenetworks.InProceedingsofthe31stInternationalConferenceonMachineLearning,2014.', 'KaimingHe,XiangyuZhang,ShaoqingRen,andJianSun.Deepresiduallearningforimagerecognition.arXivpreprintarXiv:1512.03385,2015.', 'KaiserandIlyaSutskever.Neuralgpuslearnalgorithms.arXivpreprintarXiv:1511.08228,2015.', 'NalKalchbrenner,IvoDanihelka,andAlexGraves.Gridlongshort-termmemory.arXivpreprintarXiv:1507.01526,2015.', 'HugoLarochelleandIainMurray.Theneuralautoregressivedistributionestimator.TheJournalofMachineLearningResearch,2011.', 
'ElmanMansimov,EmilioParisotto,JimmyLeiBa,andRuslanSalakhutdinov.Generatingimagesfromcaptionswithattention.arXivpreprintarXiv:1511.02793,2015.', 'JonathanMasci,UeliMeier,DanCire¸san,andJürgenSchmidhuber.Stackedconvolutionalauto-encodersforhierarchicalfeatureextraction.InNeuralNetworksandMachineLearningŒICANN2011', ',pages52Œ59.Springer,2011.', 'JunhyukOh,XiaoxiaoGuo,HonglakLee,RichardLLewis,andSatinderSingh.Action-conditionalvideopredictionusingdeepnetworksinatarigames.InAdvancesinNeuralInformationProcessingSystems,pages2845Œ2853,2015.[18', 'ScottReed,ZeynepAkata,XinchenYan,LajanugenLogeswaran,BerntSchiele,andHonglakLee.Generativeadversarialtexttoimagesynthesis.arXivpreprintarXiv:1605.05396,2016.', 'DaniloJRezende,ShakirMohamed,andDaanWierstra.Stochasticbackpropagationandapproximateinferenceindeepgenerativemodels.InProceedingsofthe31stInternationalConferenceonMachineLearning,2014.', 'DaniloJimenezRezende,ShakirMohamed,IvoDanihelka,KarolGregor,andDaanWierstra.One-shotgeneralizationindeepgenerativemodels.arXivpreprintarXiv:1603.05106,2016.', 'RuslanSalakhutdinov,JoshuaBTenenbaum,andAntonioTorralba.Learningwithhierarchical-deepmodels.PatternAnalysisandMachineIntelligence,IEEETransactionson,35(8):1958Œ1971,2013.', 'FlorianSchroff,DmitryKalenichenko,andJamesPhilbin.Facenet:Aembeddingforfacerecognitionandclustering.InProceedingsoftheIEEEConferenceonComputerVisionandPatternRecognition,pages815Œ823,2015.', 'JaschaSohl-Dickstein,EricA.Weiss,NiruMaheswaranathan,andSuryaGanguli.Deepunsupervisedlearningusingnonequilibriumthermodynamics.Proceedingsofthe32ndInternationalConferenceonMachineLearning,2015.', 'RupeshKSrivastava,KlausGreff,andJürgenSchmidhuber.Trainingverydeepnetworks.InAdvancesinNeuralInformationProcessingSystems,pages2368Œ2376,2015.[26', 'LucasTheis,AaronvandenOord,andMatthiasBethge.Anoteontheevaluationofgenerativemodels.arXivpreprintarXiv:1511.01844,2015.', 
'BenignoUria,Marc-AlexandreCôté,KarolGregor,IainMurray,andHugoLarochelle.Neuralautoregres-sivedistributionestimation.arXivpreprintarXiv:1605.02226,2016.', 'AaronvandenOordandJoniDambre.Locally-connectedtransformationsfordeepgmms.InInternationalConferenceonMachineLearning(ICML):DeeplearningWorkshop,Abstracts,pages1Œ8,2015.', 'AaronvandenOord,NalKalchbrenner,andKorayKavukcuoglu.Pixelrecurrentneuralnetworks.arXivpreprintarXiv:1601.06759,2016.', 'AäronvandenOordandBenjaminSchrauwen.Factoringvariationsinnaturalimageswithdeepgaussianmixturemodels.InAdvancesinNeuralInformationProcessingSystems,2014.', 'AaronvandenOordandBenjaminSchrauwen.Thestudent-tmixtureasanaturalimagepatchpriorwithapplicationtoimagecompression.TheJournalofMachineLearningResearch,2014.']
In [31]:
searches = [re.search(r'arXivpreprintarXiv:[0-9]+\.[0-9]+', title) for title in titles]
arxivs = [search.group(0) for search in searches if search is not None]
arxiv_article_ids = [arxiv.split(':')[1] for arxiv in arxivs]
arxiv_article_ids
Out[31]:
['1603.04467',
 '1606.01868',
 '1410.8516',
 '1508.06576',
 '1601.06759',
 '1512.03385',
 '1511.08228',
 '1507.01526',
 '1511.02793',
 '1605.05396',
 '1603.05106',
 '1511.01844',
 '1605.02226',
 '1601.06759']
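The ID list above contains '1601.06759' twice, because two different reference entries carry the same arXiv number. A minimal sketch of an extraction step that also drops duplicates while keeping first-seen order is given here. The function name `unique_arxiv_ids` and the sample strings are illustrative assumptions, not part of the original notebook.

```python
import re

# Escaped dot and '+' quantifiers: match only "digits.digits" after the prefix.
ARXIV_ID = re.compile(r'arXivpreprintarXiv:([0-9]+\.[0-9]+)')

def unique_arxiv_ids(titles):
    """Return the arXiv IDs found in `titles`, without duplicates, in first-seen order."""
    seen = set()
    ids = []
    for title in titles:
        match = ARXIV_ID.search(title)
        if match is None:
            continue  # entry has no arXiv preprint number
        article_id = match.group(1)
        if article_id not in seen:
            seen.add(article_id)
            ids.append(article_id)
    return ids

# Hypothetical sample in the same space-free form that extractText() produced above.
sample = [
    'A.Towardsconceptualcompression.arXivpreprintarXiv:1601.06759,2016.',
    'B.Deepresiduallearning.arXivpreprintarXiv:1512.03385,2015.',
    'C.Pixelrecurrentneuralnetworks.arXivpreprintarXiv:1601.06759,2016.',
    'D.Generativeadversarialnets.InNIPS,2014.',
]
print(unique_arxiv_ids(sample))  # ['1601.06759', '1512.03385']
```

Dropping duplicates at this stage would avoid requesting the same PDF more than once in the download step.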
In [33]:
from urllib.request import urlopen
base_url = 'https://arxiv.org/pdf/'

for arxiv_article_id in arxiv_article_ids:
    # Check that each PDF URL opens, then release the connection.
    urlopen(base_url + arxiv_article_id + '.pdf').close()
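The loop above opens each URL but does not read or save the response, and a single failed request stops the whole loop. A hedged sketch of a download helper that handles both points is given here. The names `pdf_url` and `fetch_pdfs`, the `opener` parameter, and the 30-second timeout are assumptions made for illustration; `opener` defaults to `urllib.request.urlopen` and exists only so the function can be tried without network access.

```python
from urllib.request import urlopen

BASE_URL = 'https://arxiv.org/pdf/'

def pdf_url(article_id):
    """Build the PDF URL for an arXiv ID such as '1603.04467'."""
    return BASE_URL + article_id + '.pdf'

def fetch_pdfs(article_ids, opener=urlopen, timeout=30):
    """Return {article_id: bytes}; IDs whose download fails are reported and skipped."""
    pdfs = {}
    for article_id in article_ids:
        try:
            # The context manager closes the connection once the body is read.
            with opener(pdf_url(article_id), timeout=timeout) as response:
                pdfs[article_id] = response.read()
        except OSError as err:
            # urllib's URLError and HTTPError, and socket timeouts, are OSError subclasses.
            print('skipped', article_id, err)
    return pdfs
```

Under these assumptions, `fetch_pdfs(arxiv_article_ids)` would return the PDF bytes keyed by arXiv ID, leaving out any ID that could not be fetched.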
In [8]:
import re
from collections import OrderedDict
import PyPDF2
from urllib.request import urlopen
from sqlite import SQLHelper

base_url = 'https://arxiv.org/pdf/'

def del_list(w_array, index):
    del w_array[:index]

def getTitle(w_array):
    # Map each line index to the first four-digit match, or None if the line has none.
    array = OrderedDict()
    for i, w in enumerate(w_array):
        array[i] = re.search('[1-2][0-9][0-9][0-9]', w)

    # Keep only the lines that matched. The key is the index just past the line,
    # so it can be used directly as a slice end.
    years = OrderedDict()
    for i, state in array.items():
        if state is not None:
            years[i + 1] = state.group(0)

    # A plausible publication year marks the end of one reference entry.
    # Unlike the earlier version, any "[n]" label is kept in the entry text.
    titles = []
    previous_i = 0
    for i, year in years.items():
        if 1950 < int(year) <= 2017:
            titles.append(''.join(w_array[previous_i:i]))
            previous_i = i
    return titles

def getArticleIds(titles):
    searches = [re.search(r'arXivpreprintarXiv:[0-9]+\.[0-9]+', title) for title in titles]
    arxivs = [search.group(0) for search in searches if search is not None]
    arxiv_article_ids = [arxiv.split(':')[1] for arxiv in arxivs]
    return arxiv_article_ids

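def read_pdf(article_id):
    # Download the PDF, save a local copy, and return a reader over that copy.
    # Assumes the ./paper_pdf/ directory already exists.
    path = './paper_pdf/' + article_id + '.pdf'
    pdf_byte = urlopen(base_url + article_id + '.pdf').read()
    with open(path, 'wb') as fs:
        fs.write(pdf_byte)
    # The file stays open because the reader still needs it when pages are read later.
    pdfReader = PyPDF2.PdfFileReader(open(path, 'rb'))
    return pdfReader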

def recurrent_(sql_helper, arxiv_article_ids, depth = 0):
    # Stop following references once the recursion is five levels deep.
    if depth == 5: return

    for arxiv_article_id in arxiv_article_ids:
        try:
            pdfReader = read_pdf(arxiv_article_id)
        except Exception:
            # Skip papers that cannot be downloaded or parsed.
            continue

        numOfPages = pdfReader.getNumPages()
        titles = []
        in_references = False  # becomes True once the 'References' heading has been seen
        for page_no in range(numOfPages):
            page = pdfReader.getPage(page_no).extractText().split('\n')
            if 'References' in page:
                # Drop everything up to and including the heading line.
                del_list(page, page.index('References') + 1)
                in_references = True
            elif not in_references:
                continue

            titles.extend(getTitle(page))

        arXivIds = getArticleIds(titles)
        for arXivId in arXivIds: sql_helper.Insert_Paper(arXivId)
        print(arXivIds)
        recurrent_(sql_helper, arXivIds, depth + 1)
        print(titles)

sqlHelper = SQLHelper('Paper.db')
sqlHelper.create_table()
recurrent_(sqlHelper,['1603.04467'])
['1412.7755', '1410.0759', '1404.5997', '1412.6564', '1507.04296', '1502.02072']
[]
[]
['1408.5093', '1408.2873', '1312.5851', '1312.6229', '1409.4842']
[]
[]
[]
['Dahl,G.E.,Sainath,T.N.,andHinton,G.E.ImprovingDeepNeuralNetworksforLVCSRusingRectiedLinearUnitsandDropout.InICASSP,2013.', 'Glorot,X.,Bordes,A.,andBengio,Y.DeepSparseRectierNetworks.InAISTATS,pp.315Œ323,2011.', 'Graves,A.andJaitly,N.TowardsEnd-to-EndSpeechRecognitionwithRecurrentNeuralNetworks.InICML,2014.', 'Graves,A.,Fern´andez,S.,Gomez,F.,andSchmidhuber,J.Connectionisttemporalclassication:Labellingunsegmentedsequencedatawithrecurrentneuralnetworks.InICML,pp.369Œ376.ACM,2006.', 'Hinton,G.E.,Deng,L.,Yu,D.,Dahl,G.E.,Mohamed,A.,Jaitly,N.,Senior,A.,Vanhoucke,V.,Nguyen,P.,Sainath,T.,andKingsbury,B.DeepNeuralNetworksforAcousticModelinginSpeechRecognition.IEEESignalProcessingMagazine,29(November):82Œ97,2012.', 'Hochreiter,S.andSchmidhuber,J.Longshort-termmemory.NeuralComputation,9:1735Œ1780,1997.Maas,A.,Hannun,A.,andNg,A.RectierNonlinearitiesImproveNeuralNetworkAcousticModels.InICMLWorkshoponDeepLearningforAudio,Speech,andLanguageProcessing,2013.', 'Povey,D.,Ghoshal,A.,Boulianne,G.,Burget,L.,Glembek,O.,Vesel´y,K.,Goel,N.,Hannemann,M.,Motlicek,P.,Qian,Y.,Schwarz,P.,Silovsky,J.,andStemmer,G.Thekaldispeechrecognitiontoolkit.InASRU,2011.', 'Sutskever,I.,Martens,J.,Dahl,G.,andHinton,G.OntheImportanceofMomentumandInitializationinDeepLearning.InICML,2013.', 'Zeiler,M.D.,Ranzato,M.,Monga,R.,Mao,M.,Yang,K.,Le,Q.V.,Nguyen,P.,Senior,A.,Vanhoucke,V.,Dean,J.,andHinton,G.E.OnRectiedLinearUnitsforSpeechProcessing.InICASSP,2013.']
[]
['[1]S.Ben-Yacoub,B.Fasel,andJ.Luttin.Fastfacedetectionusingmlpandfft.InProceedingsoftheSecondInternationalConferenceonAudioandVideo-basedBiometricPersonAuthen-(AVBPA1999)', ',1999.', '[2]ThierryBertin-Mahieux,DanielP.W.Ellis,BrianWhitman,andPaulLamere.Themillionsongdataset.InProceedingsofthe12thInternationalConferenceonMusicInformationRetrieval(ISMIR2011)', ',2011.', '[3]A.Bosch,A.Zisserman,andX.Munoz.Representingshapewithaspatialpyramidkernel.InProceedingsoftheACMInternationalConferenceonImageandVideoRetrieval,2007.', '[4]RonanCollobert,KorayKavukcuoglu,andClementFarabet.Torch7:Amatlab-likeenviron-mentformachinelearning.InNIPS,2011.', '[5]JamesCooleyandJohnTukey.Analgorithmforthemachinecalculationofcomplexfourierseries.MathematicsofComputation,(19):297Œ301,1965.', '[6]JiaDeng,WeiDong,RichardSocher,Li-JiaLi,KaiLi,andLiFei-Fei.Imagenet:Alarge-scalehierarchicalimagedatabase.2009.', '[7]L.Fei-Fei,R.Fergus,andPietroPerona.Learninggenerativevisualmodelsfromfewtrainingexamples:Anincrementalbayesianapproachtestedon101objectcategories.2004.', '[8]AlexKrizhevsky,IlyaSutskever,andGeoffreyE.Hinton.Imagenetwithdeepconvolutionalneuralnetworks.InNIPS,pages1106Œ1114,2012.[9]Y.LeCun,L.Bottou,G.Orr,andK.Muller.Efbackprop.InG.OrrandMullerK.,editors,NeuralNetworks:Tricksofthetrade.Springer,1998.', '[10]G.TzanetakisandP.Cook.Musicalgenreofaudiosignals.IEEETransactionsonSpeechandAudioProcessing,10(5):293Œ302,July2002.']
[]
['[1]J.Carreira,F.Li,andC.Sminchisescu.Objectrecognitionbysequentialgure-groundranking.Interna-tionaljournalofcomputervision,98(3):243Œ262,2012.', '[2]J.CarreiraandC.Sminchisescu.Constrainedparametricmin-cutsforautomaticobjectsegmentation,release1.http://sminchisescu.ins.uni-bonn.de/code/cpmc/.[3]D.C.Ciresan,J.Meier,andJ.Schmidhuber.Multi-columndeepneuralnetworksforimageclassication.InCVPR,2012.', '[4]M.DelakisandC.Garcia.Textdetectionwithconvolutionalneuralnetworks.InInternationalConferenceonComputerVisionTheoryandApplications(VISAPP2008)', ',2008.', '[5]J.Deng,W.Dong,R.Socher,L.-J.Li,K.Li,andL.Fei-Fei.ImageNet:ALarge-ScaleHierarchicalImageDatabase.InCVPR09,2009.', '[6]I.EndresandD.Hoiem.Categoryindependentobjectproposals.InComputerVisionŒECCV2010', ',pages575Œ588.Springer,2010.', '[7]C.Farabet,C.Couprie,L.Najman,andY.LeCun.Learninghierarchicalfeaturesforscenelabeling.IEEETransactionsonPatternAnalysisandMachineIntelligence,2013.inpress.', '[8]C.GarciaandM.Delakis.Convolutionalfacender:Aneuralarchitectureforfastandrobustfacedetection.IEEETransactionsonPatternAnalysisandMachineIntelligence,2004.', '[9]A.Giusti,D.C.Ciresan,J.Masci,L.M.Gambardella,andJ.Schmidhuber.Fastimagescanningwithdeepmax-poolingconvolutionalneuralnetworks.InInternationalConferenceonImageProcessing(ICIP),2013.', '[10]R.Hadsell,P.Sermanet,M.Scofer,A.Erkan,K.Kavackuoglu,U.Muller,andY.LeCun.Learninglong-rangevisionforautonomousoff-roaddriving.JournalofFieldRobotics,26(2):120Œ144,February2009.', '[11]G.Hinton,N.Srivastave,A.Krizhevsky,I.Sutskever,andR.R.Salakhutdinov.Improvingneuralnet-worksbypreventingco-adaptationoffeaturedetectors.arXiv:1207.0580,2012.[12]G.E.Hinton,A.Krizhevsky,andS.D.Wang.Transformingauto-encoders.InArticialNeuralNetworksandMachineLearningŒICANN2011', ',pages44Œ51.SpringerBerlinHeidelberg,2011.', 
"[13]V.Jain,J.F.Murray,F.Roth,S.Turaga,V.Zhigulin,K.Briggman,M.Helmstaedter,W.Denk,andH.S.Seung.Supervisedlearningofimagerestorationwithconvolutionalnetworks.InICCV'07.[14]K.Jarrett,K.Kavukcuoglu,M.Ranzato,andY.LeCun.Whatisthebestmulti-stagearchitectureforobjectrecognition?InProc.InternationalConferenceonComputerVision(ICCV'09).IEEE,2009.", '[15]A.Krizhevsky,I.Sutskever,andG.Hinton.Imagenetclassicationwithdeepconvolutionalneuralnet-works.InNIPS,2012.', '[16]Y.LeCun,B.Boser,J.S.Denker,D.Henderson,R.E.Howard,W.Hubbard,andL.D.Jackel.Hand-writtendigitrecognitionwithaback-propagationnetwork.InD.Touretzky,editor,AdvancesinNeuralInformationProcessingSystems(NIPS1989)', ',volume2,Denver,CO,1990.MorganKaufman.', "[17]Y.LeCun,L.Bottou,Y.Bengio,andP.Haffner.Gradient-basedlearningappliedtodocumentrecognition.ProceedingsoftheIEEE,86(11):2278Œ2324,November1998.[18]Y.LeCun,F.-J.Huang,andL.Bottou.Learningmethodsforgenericobjectrecognitionwithinvariancetoposeandlighting.InProceedingsofCVPR'04.IEEEPress,2004.", '[19]S.Manen,M.Guillaumin,andL.VanGool.Primeobjectproposalswithrandomizedprimsalgorithm.InInternationalConferenceonComputerVision(ICCV),2013.', '[20]O.Matan,J.Bromley,C.Burges,J.Denker,L.Jackel,Y.LeCun,E.Pednault,W.Sattereld,C.Stenard,andT.Thompson.Readinghandwrittendigits:Azipcoderecognitionsystem.IEEEComputer,25(7):59Œ63,July1992.', '[21]F.Ning,D.Delhomme,Y.LeCun,F.Piano,L.Bottou,andP.Barbano.Towardautomaticphenotypingofdevelopingembryosfromvideos.IEEETransactionsonImageProcessing,14(9):1360Œ1371,September2005.SpecialissueonMolecularandCellularBioimaging.', '[22]S.NowlanandJ.Platt.Aconvolutionalneuralnetworkhandtracker.pages901Œ908,SanMateo,CA,1995.MorganKaufmann.', 
'[23]M.Osadchy,Y.LeCun,andM.Miller.Synergisticfacedetectionandposeestimationwithenergy-basedmodels.JournalofMachineLearningResearch,8:1197Œ1215,May2007.[24]P.Sermanet,S.Chintala,andY.LeCun.Convolutionalneuralnetworksappliedtohousenumbersdigitclassication.InInternationalConferenceonPatternRecognition(ICPR2012),2012.', "[25]P.Sermanet,K.Kavukcuoglu,S.Chintala,andY.LeCun.Pedestriandetectionwithunsupervisedmulti-stagefeaturelearning.InProc.InternationalConferenceonComputerVisionandPatternRecognition(CVPR'13).IEEE,June2013.", "[26]P.SermanetandY.LeCun.Trafcsignrecognitionwithmulti-scaleconvolutionalnetworks.InProceed-ingsofInternationalJointConferenceonNeuralNetworks(IJCNN'11),2011.", '[27]G.Taylor,R.Fergus,G.Williams,I.Spiro,andC.Bregler.Pose-sensitiveembeddingbynonlinearncaregression.InNIPS,2011.', '[28]G.Taylor,I.Spiro,C.Bregler,andR.Fergus.Learninginvarancethroughimitation.InCVPR,2011.', '[29]J.R.R.Uijlings,K.E.A.vandeSande,T.Gevers,andA.W.M.Smeulders.Selectivesearchforobjectrecognition.InternationalJournalofComputerVision,104(2):154Œ171,2013.', '[30]R.Vaillant,C.Monrocq,andY.LeCun.Originalapproachforthelocalisationofobjectsinimages.IEEProconVision,Image,andSignalProcessing,141(4):245Œ250,August1994.']
[]
['[1]Knowyourmeme:Weneedtogodeeper.http://knowyourmeme.com/memes/we-need-to-go-deeper.Accessed:2014-09-15.', '[2]SanjeevArora,AdityaBhaskara,RongGe,andTengyuMa.Provableboundsforlearningsomedeeprepresentations.CoRR,abs/1310.6343,2013.[3]¨UmitV.C¸ataly¨urek,CevdetAykanat,andBoraUc¸ar.Ontwo-dimensionalsparsematrixpar-titioning:Models,methods,andarecipe.SIAMJ.Sci.Comput.,32(2):656Œ683,February2010.', "[4]JeffreyDean,GregCorrado,RajatMonga,KaiChen,MatthieuDevin,MarkMao,Marc'aurelioRanzato,AndrewSenior,PaulTucker,KeYang,QuocV.Le,andAndrewY.Ng.Largescaledistributeddeepnetworks.InP.Bartlett,F.c.n.Pereira,C.j.c.Burges,L.Bot-tou,andK.q.Weinberger,editors,AdvancesinNeuralInformationProcessingSystems25,pages1232Œ1240.2012.[5]DumitruErhan,ChristianSzegedy,AlexanderToshev,andDragomirAnguelov.Scalableob-jectdetectionusingdeepneuralnetworks.InComputerVisionandPatternRecognition,2014.", 'CVPR2014.IEEEConferenceon', ',2014.', '[6]RossB.Girshick,JeffDonahue,TrevorDarrell,andJitendraMalik.Richfeaturehierarchiesforaccurateobjectdetectionandsemanticsegmentation.InComputerVisionandPatternRecognition,2014.CVPR2014.IEEEConferenceon', ',2014.', '[7]GeoffreyE.Hinton,NitishSrivastava,AlexKrizhevsky,IlyaSutskever,andRuslanSalakhut-dinov.Improvingneuralnetworksbypreventingco-adaptationoffeaturedetectors.CoRR,abs/1207.0580,2012.[8]AndrewG.Howard.SomeimprovementsondeepconvolutionalneuralnetworkbasedimageCoRR,abs/1312.5402,2013.[9]AlexKrizhevsky,IlyaSutskever,andGeoffHinton.Imagenetwithdeepcon-volutionalneuralnetworks.InAdvancesinNeuralInformationProcessingSystems25,pages1106Œ1114,2012.[10]Y.LeCun,B.Boser,J.S.Denker,D.Henderson,R.E.Howard,W.Hubbard,andL.D.Jackel.Backpropagationappliedtohandwrittenzipcoderecognition.NeuralComput.,1(4):541Œ551,December1989.', 
'[11]YannLeCun,L´eonBottou,YoshuaBengio,andPatrickHaffner.Gradient-basedlearningappliedtodocumentrecognition.ProceedingsoftheIEEE,86(11):2278Œ2324,1998.[12]MinLin,QiangChen,andShuichengYan.Networkinnetwork.CoRR,abs/1312.4400,2013.[13]B.T.PolyakandA.B.Juditsky.Accelerationofstochasticapproximationbyaveraging.SIAMJ.ControlOptim.,30(4):838Œ855,July1992.', '[15]ThomasSerre,LiorWolf,StanleyM.Bileschi,MaximilianRiesenhuber,andTomasoPoggio.Robustobjectrecognitionwithcortex-likemechanisms.IEEETrans.PatternAnal.Mach.Intell.,29(3):411Œ426,2007.', "[16]FengguangSongandJackDongarra.Scalingupmatrixcomputationsonshared-memorymanycoresystemswith1000cpucores.InProceedingsofthe28thACMInternationalCon-ferenceonSupercomputing,ICS'14,pages333Œ342,NewYork,NY,USA,2014.ACM.", '[17]IlyaSutskever,JamesMartens,GeorgeE.Dahl,andGeoffreyE.Hinton.Ontheimportanceofinitializationandmomentumindeeplearning.InProceedingsofthe30thInternationalConferenceonMachineLearning,ICML2013,Atlanta,GA,USA,16-21June2013', ',volume28ofJMLRProceedings,pages1139Œ1147.JMLR.org,2013.[18]ChristianSzegedy,AlexanderToshev,andDumitruErhan.Deepneuralnetworksforobjectdetection.InChristopherJ.C.Burges,L´eonBottou,ZoubinGhahramani,andKilianQ.Weinberger,editors,AdvancesinNeuralInformationProcessingSystems26:27thAnnualConferenceonNeuralInformationProcessingSystems2013.Proceedingsofameetingheld', 'December5-8,2013,LakeTahoe,Nevada,UnitedStates.', ',pages2553Œ2561,2013.[19]AlexanderToshevandChristianSzegedy.Deeppose:Humanposeestimationviadeepneuralnetworks.CoRR,abs/1312.4659,2013.[20]KoenE.A.vandeSande,JasperR.R.Uijlings,TheoGevers,andArnoldW.M.Smeulders.Segmentationasselectivesearchforobjectrecognition.InProceedingsofthe2011Interna-', "tionalConferenceonComputerVision,ICCV'11,pages1879Œ1886,Washington,DC,USA,2011.IEEEComputerSociety.", 
'[21]MatthewD.ZeilerandRobFergus.Visualizingandunderstandingconvolutionalnetworks.InDavidJ.Fleet,Tom´asPajdla,BerntSchiele,andTinneTuytelaars,editors,ComputerVision-ECCV2014-13thEuropeanConference,Zurich,Switzerland,September6-12,2014,Pro-', 'ceedings,PartI,volume8689ofLectureNotesinComputerScience,pages818Œ833.Springer,2014.']
['[1]convnet-benchmarks.https://github.com/soumith/convnet-benchmarks,2014.', '[2]NetlibBLAS.http://www.netlib.org/blas/,2014.', '[3]NVIDIAcuDNN-GPUaccelerateddeeplearning.https://developer.nvidia.com/cuDNN,2014.', '[4]YoshuaBengio,AaronC.Courville,andPascalVincent.Unsupervisedfeaturelearninganddeeplearning:Areviewandnewperspectives.CoRR,abs/1206.5538,2012.[5]JamesBergstra,OlivierBreuleux,Fr´ed´ericBastien,PascalLamblin,RazvanPascanu,Guil-laumeDesjardins,JosephTurian,DavidWarde-Farley,andYoshuaBengio.Theano:acpuandgpumathexpressioncompiler.InSciPy,volume4,page3,2010.', '[6]KumarChellapilla,SiddPuri,PatriceSimard,etal.Highperformanceconvolutionalneuralnetworksfordocumentprocessing.InWorkshoponFrontiersinHandwritingRecognition,2006.', '[7]AdamCoates,BrodyHuval,TaoWang,DavidWu,BryanCatanzaro,andAndrewNg.DeeplearningwithCOTSHPCsystems.InICML,pages1337Œ1345,2013.[8]RonanCollobert,KorayKavukcuoglu,andCl´ementFarabet.Torch7:Amatlab-likeenviron-mentformachinelearning.InBigLearn,NIPSWorkshop,2011.', '[9]GeorgeEDahl,DongYu,LiDeng,andAlexAcero.Context-dependentpre-traineddeepneuralnetworksforlarge-vocabularyspeechrecognition.Audio,Speech,andLanguagePro-cessing,IEEETransactionson,20(1):30Œ42,2012.', '[10]GeoffreyHinton,LiDeng,DongYu,GeorgeEDahl,Abdel-rahmanMohamed,NavdeepJaitly,AndrewSenior,VincentVanhoucke,PatrickNguyen,TaraNSainath,etal.Deepneuralnet-worksforacousticmodelinginspeechrecognition:Thesharedviewsoffourresearchgroups.SignalProcessingMagazine,IEEE,29(6):82Œ97,2012.', '[11]YangqingJia,EvanShelhamer,JeffDonahue,SergeyKarayev,JonathanLong,RossGirshick,SergioGuadarrama,andTrevorDarrell.Caffe:Convolutionalarchitectureforfastfeatureembedding.arXivpreprintarXiv:1408.5093,2014.', '[12]AlexKrizhevsky.cudaconvnet2.https://code.google.com/p/cuda-convnet2/,2014.', 
'[13]AlexKrizhevsky,IlyaSutskever,andGeoffreyEHinton.Imagenetwithdeepconvolutionalneuralnetworks.InNIPS,pages1097Œ1105,2012.[14]YannLeCun,L´eonBottou,YoshuaBengio,andPatrickHaffner.Gradient-basedlearningappliedtodocumentrecognition.ProceedingsoftheIEEE,86(11):2278Œ2324,1998.[15]AndrewLMaas,AwniYHannun,DanielJurafsky,andAndrewYNg.First-passlargevo-cabularycontinuousspeechrecognitionusingbi-directionalrecurrentdnns.arXivpreprintarXiv:1408.2873,2014.', '[16]MichaelMathieu,MikaelHenaff,andYannLeCun.Fasttrainingofconvolutionalnetworksthroughffts.arXivpreprintarXiv:1312.5851,2013.', '[17]OlgaRussakovsky,JiaDeng,HaoSu,JonathanKrause,SanjeevSatheesh,SeanMa,ZhihengHuang,AndrejKarpathy,AdityaKhosla,MichaelBernstein,AlexanderC.Berg,andLiFei-Fei.Imagenetlargescalevisualrecognitionchallenge,2014.', '[18]PierreSermanet,DavidEigen,XiangZhang,Micha¨elMathieu,RobFergus,andYannLe-Cun.Overfeat:Integratedrecognition,localizationanddetectionusingconvolutionalnet-works.arXivpreprintarXiv:1312.6229,2013.', '[19]ChristianSzegedy,WeiLiu,YangqingJia,PierreSermanet,ScottReed,DragomirAnguelov,DumitruErhan,VincentVanhoucke,andAndrewRabinovich.Goingdeeperwithconvolutions.arXivpreprintarXiv:1409.4842,2014.', '[20]GuangmingTan,LinchuanLi,SeanTreichler,EverettPhillips,YungangBao,andNinghuiSun.FastimplementationofDGEMMonFermiGPU.InSupercomputing2011', ",SC'11,pages35:1Œ35:11,NewYork,NY,USA,2011.ACM.", "[21]HenryS.Warren.Hacker'sDelight.Addison-WesleyProfessional,2002."]
['1312.6186', '1312.5853']
['1310.1531', '1311.2524', '1207.0580', '1112.6209', '1106.5730', '1212.5701']
['1207.0580', '0902.1284']
[]
[]
[]
['[1]DavidDonoho.Compressedsensing.IEEETrans.Info.Theory,52(4):1289Œ1306,2006.[2]T.DietterichandG.Bakiri.Solvingmulticlasslearningproblemsviaerror-correctingoutputcodes.JournalofArticialIntelligenceResearch,2:263Œ286,1995.', '[3]R.RifkinandA.Klautau.Indefenseofone-vs-allclassication.JournalofMachineLearningResearch,5:101Œ141,2004.', '[4]M.Boutell,J.Luo,X.Shen,andC.Brown.Learningmulti-labelsceneclassication.PatternRecognition,37(9):1757Œ1771,2004.', '[5]A.ClareandR.D.King.Knowledgediscoveryinmulti-labelphenotypedata.InEuropeanConferenceonPrinciplesofDataMiningandKnowledgeDiscovery,2001.', '[6]B.Taskar,C.Guestrin,andD.Koller.Max-marginmarkovnetworks.InNIPS,2003.', '[7]N.Cesa-Bianchi,C.Gentile,andL.Zaniboni.Incrementalalgorithmsforhierarchicalclassication.JournalofMachineLearningResearch,7:31Œ54,2006.', '[8]I.Tsochantaridis,T.Hofmann,T.Joachims,andY.Altun.Supportvectormachinelearningforinterdependentandstructuredoutputspaces.InICML,2004.', '[9]J.Rousu,C.Saunders,S.Szedmak,andJ.Shawe-Taylor.Kernel-basedlearningofhierarchicalmultilabelclassicationmodels.JournalofMachineLearningResearch,7:1601Œ1626,2006.[10]J.Huang,T.Zhang,andD.Metaxax.Learningwithstructuredsparsity.InICML,2009.', '[11]G.Tsoumakas,I.Katakis,andI.Vlahavas.Effectiveandefcientmultilabelclassicationindomainswithlargenumberoflabels.InProc.ECML/PKDD2008WorkshoponMiningMultidimensionalD', 'ata,2008.', '[12]ErinAllwein,RobertSchapire,andYoramSinger.Reducingmulticlasstobinary:Aunifyingapproachformarginclassiers.JournalofMachineLearningResearch,1:113Œ141,2000.', '[13]J.LangfordandA.Beygelzimer.Sensitiveerrorcorrectingoutputcodes.InProc.ConferenceonLearningTheory,2005.', '[14]EmmanuelCand˚es,JustinRomberg,andTerrenceTao.Stablesignalrecoveryfromincompleteandinaccuratemeasurements.Comm.PureAppl.Math.,59:1207Œ122,2006.[15]R.DeVore.Deterministicconstructionsofcompressedsensingmatrices.J.ofComplexity,23:918Œ925,2007.', 
'[16]ShaharMendelson,AlainPajor,andNicoleTomczak-Jaegermann.UniformuncertaintyprincipleforBernoulliandsubgaus-sianensembles.ConstructiveApproximation,28(3):277Œ289,2008.', '[17]M.RudelsonandR.Vershynin.Sparsereconstructionbyconvexrelaxation:FourierandGaussianmeasurements.InProc.ConferenceonInformationSciencesandSystems,2006.', '[18]S.MallatandZ.Zhang.Matchingpursuitswithtime-frequencydictionaries.IEEETransactionsonSignalProcessing,41(12):3397Œ3415,1993.', '[19]TongZhang.Adaptiveforward-backwardgreedyalgorithmforsparselearningwithlinearmodels.InProc.NeuralInforma-tionProcessingSystems,2008.', '[20]D.NeedellandJ.A.Tropp.CoSaMP:Iterativesignalrecoveryfromincompleteandinaccuratesamples.AppliedandCom-putationalHarmonicAnalysis,2007.', '[21]BradleyEfron,TrevorHastie,IainJohnstone,andRobertTibshirani.Leastangleregression.AnnalsofStatistics,32(2):407Œ499,2004.', '[22]ShamM.Kakade,KarthikSridharan,andAmbujTewari.Onthecomplexityoflinearprediction:Riskbounds,marginbounds,andregularization.InProc.NeuralInformationProcessingSystems,2008.', '[23]AndrewNg.Featureselection,l1vs.l2regularization,androtationalinvariance.InICML,2004.', '[24]DavidDonoho,MichaelElad,andVladimirTemlyakov.Stablerecoveryofsparseovercompleterepresentationsinthepres-enceofnoise.IEEETrans.Info.Theory,52(1):6Œ18,2006.', '[25]SanjoyDasgupta.LearningProbabilityDistributions.PhDthesis,UniversityofCalifornia,2000.', '[26]LuisvonAhnandLauraDabbish.Labelingimageswithacomputergame.InProc.ACMConferenceonHumanFactorsinComputingSystems,2004.', '[27]MarcinMarszaek,CordeliaSchmid,HediHarzallah,andJoostvandeWeijer.Learningobjectrepresentationsforvisualobjectclassrecognition.InVisualRecognitionChallangeWorkshop,inconjunctionwithICCV,2007.', '[28]HerbertBay,AndreasEss,TinneTuytelaars,andLucVanGool.SURF:Speededuprobustfeatures.ComputerVisionandImageUnderstanding,110(3):346Œ359,2008.']
['Ando,R.andZhang,T.Aframeworkforlearningpredictivestructuresfrommultipletasksandunlabeleddata.JMLR,6,2005.', 'Argyriou,Andreas,Evgeniou,Theodoros,andPontil,Massimil-iano.Multi-taskfeaturelearning.InNIPS,2006.', 'Bay,H.,Tuytelaars,T.,andGool,L.Van.SURF:Speededuprobustfeatures.InECCV,2006.', 'DeCAF:ADeepConvolutionalActivationFeatureforGenericVisualRecognitionsualrecognitionchallenge2012.2012.URL', 'http://www.image-net.org/challenges/LSVRC/2012/', '.Berg,T.andBelhumeur,P.POOF:Part-basedone-vs-onefeaturesforcategorization,facevandattributeestimation.InCVPR,2013.', 'Bo,L.,Ren,X.,andFox,D.Kerneldescriptorsforvisualrecog-nition.InNIPS,2010.', 'Bourdev,Lubomir,Maji,Subhransu,andMalik,Jitendra.De-scribingpeople:Aposelet-basedapproachtoattributecation.InICCV,2011.', 'Caruana,R.Multitasklearning.MachineLearning,28,1997.', 'Chopra,S.,Balakrishnan,S.,andGopalan,R.Dlid:Deeplearn-ingfordomainadaptationbyinterpolatingbetweendomains.InICMLWorkshoponChallengesinRepresentationLearning,2013.', 'Dalal,N.andTriggs,B.Histogramsoforientedgradientsforhumandetection.InCVPR,2005.', 'DaumeIII,H.Frustratinglyeasydomainadaptation.InACL,2007.', 'Deng,J.,Dong,W.,Socher,R.,Li,L.,Li,K.,andFei-Fei,L.ImageNet:ALarge-ScaleHierarchicalImageDatabase.InCVPR,2009.', 'Fei-Fei,L.,Fergus,R.,andPerona,P.Learninggenerativevisualmodelsfromfewtrainingexamples:anincrementalBayesianapproachtestedon101objectcategories.InCVPR,2004.', 'Felzenszwalb,P.,Girshick,R.,McAllester,D.,andRamanan,D.Objectdetectionwithdiscriminativelytrainedpart-basedmod-els.PAMI,32,2010.', 'Fidler,S.andLeonardis,A.Towardsscalablerepresentationsofobjectcategories:Learningahierarchyofparts.InCVPR,2007.', 'Gong,B.,Shi,Y.,Sha,F.,andGrauman,K.Geodesicwkernelforunsuperviseddomainadaptation.InCVPR,2012.', 'Hinton,G.andSalakhutdinov,R.Reducingthedimensionalityofdatawithneuralnetworks.Science,2006.', 
'Hinton,G.,Srivastava,N.,Krizhevsky,A.,Sutskever,I.,andSalakhutdinov,R.Improvingneuralnetworksbypre-ventingco-adaptationoffeaturedetectors.arXivpreprintarXiv:1207.0580,2012.', 'Hoffman,J.,Rodner,E.,Donahue,J.,Saenko,K.,andDarrell,T.Eflearningofdomain-invariantimagerepresentations.InICLR,2013.', 'Hsu,D.,Kakade,S.,Langford,J.,andZhang,T.Multi-labelpredictionviacompressedsensing.arXivpreprintarXiv:0902.1284,2009.', 'Jarrett,K.,Kavukcuoglu,K.,Ranzato,M.,andLeCun,Y.Whatisthebestmulti-stagearchitectureforobjectrecognition?InICCV,2009.', 'Kennedy,L.andHauptmann,A.LSCOMlexiconandannotations(version1.0).2006.', 'Krizhevsky,A.,Sutskever,I.,andHinton,G.E.ImageNetclas-withdeepconvolutionalneuralnetworks.InNIPS,2012.', 'Kulis,B.,Saenko,K.,andDarrell,T.Whatyousawisnotwhatyouget:Domainadaptationusingasymmetrickerneltrans-forms.InCVPR,2011.', 'Le,Q.,Zou,W.,Yeung,S.,andNg,A.Learninghierarchicalinvariantspatio-temporalfeaturesforactionrecognitionwithindependentsubspaceanalysis.InCVPR,2011.', 'Le,Q.,Ranzato,M.,Monga,R.,Devin,M.,Chen,K.,Corrado,G.,Dean,J.,andNg,A.Buildinghigh-levelfeaturesusinglargescaleunsupervisedlearning.InICML,2012.', 'LeCun,Y.,Boser,B.,Denker,J.,Henderson,D.,Howard,R.,Hubbard,W.,andJackel,L.Backpropagationappliedtohand-writtenzipcoderecognition.NeuralComputation,1989.', 'LeCun,Y.,Bottou,L.,Bengio,Y.,andHaffner,P.Gradient-basedlearningappliedtodocumentrecognition.InIEEE,1998.', 'Li,L.,Su,H.,Fei-Fei,L.,andXing,E.Objectbank:Ahigh-levelimagerepresentationforscene&semanticfeatureInNIPS,2010.', 'Mesnil,G.,Dauphin,Y.,Glorot,X.,Rifai,S.,Bengio,Y.,Good-fellow,I.,Lavoie,E.,Muller,X.,Desjardins,G.,Warde-Farley,D.,Vincent,P.,Courville,A.,andBerkgstra,J.Unsupervisedandtransferlearningchallenge:adeeplearningapproach.JMLR,27,2012.', 'Oliva,A.andTorralba,A.Modelingtheshapeofthescene:Aholisticrepresentationofthespatialenvelope.IJCV,2001.', 'Quattoni,A.,Collins,M.,andDarrell,T.Transferlearningforimageclassicationwithsparseprototyperepresentations.InCVPR,2008.', 
'Raina,R.at,Battle,A.,Lee,H.,Packer,B.,andNg,A.Self-taughtlearning:Transferlearningfromunlabeleddata.InICML,2007.', 'Ren,X.andRamanan,D.Histogramsofsparsecodesforobjectdetection.InCVPR,2013.', 'Saenko,K.,Kulis,B.,Fritz,M.,andDarrell,T.Adaptingvisualcategorymodelstonewdomains.InECCV,2010.', 'Singh,S.,Gupta,A.,andEfros,A.Unsuperviseddiscoveryofmid-leveldiscriminativepatches.InECCV,2012.', 'Thrun,S.Islearningthen-ththinganyeasierthanlearningtheInNIPS,1996.', 'Torralba,A.andEfros,A.Unbiasedlookatdatasetbias.InCVPR,2011.', 'Torresani,L.,Szummer,M.,andFitzgibbon,A.Efobjectcategoryrecognitionusingclassemes.InECCV.2010.', 'vanderMaaten,L.andHinton,G.Visualizingdatausingt-sne.JMLR,9,2008.', 'Wang,J.,Yang,J.,Yu,K.,Lv,F.,Huang,T.,andGong,Y.Locality-constrainedlinearcodingforimageInCVPR,2010.', 'DeCAF:ADeepConvolutionalActivationFeatureforGenericVisualRecognitionWelinder,P.,Branson,S.,Mita,T.,Wah,C.,Schroff,F.,Belongie,S.,andPerona,P.Caltech-UCSDBirds200.TechnicalReportCNS-TR-2010-001,CaliforniaInstituteofTechnology,2010.', 'Xiao,J.,Hays,J.,Ehinger,K.,Oliva,A.,andTorralba,A.Sundatabase:Large-scalescenerecognitionfromabbeytozoo.InCVPR,2010.', 'Yang,J.,L.,Y.,Tian,Y.,Duan,L.,andGao,W.Group-sensitivemultiplekernellearningforobjectcategorization.InICCV,2009.', 'Zhang,N.,Farrell,R.,Iandola,F.,andDarrell,T.Deformablepartdescriptorsforrecognitionandattributepre-diction.InICCV,2013.', 'Zhu,L.,Chen,Y.,andYuille,A.Unsupervisedlearningofaprob-abilisticgrammarforobjectdetectionandparsing.InNIPS,2007.']
[]
['[1]B.Alexe,T.Deselaers,andV.Ferrari.Measuringtheobject-nessofimagewindows.TPAMI,2012.', '2[2]P.Arbel´aez,B.Hariharan,C.Gu,S.Gupta,L.Bourdev,andJ.Malik.Semanticsegmentationusingregionsandparts.InCVPR,2012.', '10,11[3]P.Arbel´aez,J.Pont-Tuset,J.Barron,F.Marques,andJ.Ma-lik.Multiscalecombinatorialgrouping.InCVPR,2014.', '3[4]J.Carreira,R.Caseiro,J.Batista,andC.Sminchisescu.Se-manticsegmentationwithsecond-orderpooling.InECCV,2012.', '4,10,11,13,14[5]J.CarreiraandC.Sminchisescu.CPMC:Automaticob-jectsegmentationusingconstrainedparametricmin-cuts.TPAMI,2012.', '2,3[6]D.Cires¸an,A.Giusti,L.Gambardella,andJ.Schmidhu-ber.Mitosisdetectioninbreastcancerhistologyimageswithdeepneuralnetworks.InMICCAI,2013.', '3[7]N.DalalandB.Triggs.Histogramsoforientedgradientsforhumandetection.InCVPR,2005.', '1[8]T.Dean,M.A.Ruzon,M.Segal,J.Shlens,S.Vijaya-narasimhan,andJ.Yagnik.Fast,accuratedetectionof100,000objectclassesonasinglemachine.InCVPR,2013.', '3[9]J.Deng,A.Berg,S.Satheesh,H.Su,A.Khosla,andL.Fei-Fei.ImageNetLargeScaleVisualRecognitionCompetition2012(ILSVRC2012).', 'http://www.image-net.org/challenges/LSVRC/2012/', '.1[10]J.Deng,W.Dong,R.Socher,L.-J.Li,K.Li,andL.Fei-Fei.ImageNet:Alarge-scalehierarchicalimagedatabase.InCVPR,2009.', '1[11]J.Deng,O.Russakovsky,J.Krause,M.Bernstein,A.C.Berg,andL.Fei-Fei.Scalablemulti-labelannotation.InCHI,2014.', '8[12]J.Donahue,Y.Jia,O.Vinyals,J.Hoffman,N.Zhang,E.Tzeng,andT.Darrell.DeCAF:ADeepConvolutionalActivationFeatureforGenericVisualRecognition.InICML,2014.', '2[13]M.Douze,H.J´egou,H.Sandhawalia,L.Amsaleg,andC.Schmid.Evaluationofgistdescriptorsforweb-scaleim-agesearch.InProc.oftheACMInternationalConferenceonImageandVideoRetrieval,2009.', '13[14]I.EndresandD.Hoiem.Categoryindependentobjectpro-posals.InECCV,2010.', '3[15]M.Everingham,L.VanGool,C.K.I.Williams,J.Winn,andA.Zisserman.ThePASCALVisualObjectClasses(VOC)Challenge.IJCV,2010.', '1,4[16]C.Farabet,C.Couprie,L.Najman,andY.LeCun.Learninghierarchicalfeaturesforscenelabeling.TPAMI,2013.', 
'10[17]P.Felzenszwalb,R.Girshick,D.McAllester,andD.Ra-manan.Objectdetectionwithdiscriminativelytrainedpartbasedmodels.TPAMI,2010.', '2,4,7,12[18]S.Fidler,R.Mottaghi,A.Yuille,andR.Urtasun.Bottom-upsegmentationfortop-downdetection.InCVPR,2013.', '4,5[19]K.Fukushima.Neocognitron:Aself-organizingneu-ralnetworkmodelforamechanismofpatternrecogni-tionunaffectedbyshiftinposition.Biologicalcybernetics,36(4):193Œ202,1980.', '1[20]R.Girshick,P.Felzenszwalb,andD.McAllester.Discrimi-nativelytraineddeformablepartmodels,release5.http://www.cs.berkeley.edu/Ÿrbg/latent-v5/.2,5,6,7[21]C.Gu,J.J.Lim,P.Arbel´aez,andJ.Malik.Recognitionusingregions.InCVPR,2009.', '2[22]B.Hariharan,P.Arbel´aez,L.Bourdev,S.Maji,andJ.Malik.Semanticcontoursfrominversedetectors.InICCV,2011.', '10[23]D.Hoiem,Y.Chodpathumwan,andQ.Dai.Diagnosingerrorinobjectdetectors.InECCV.2012.', '2,7,8[24]Y.Jia.Caffe:Anopensourceconvolutionalarchi-tectureforfastfeatureembedding.http://caffe.berkeleyvision.org/,2013.', '3[25]A.Krizhevsky,I.Sutskever,andG.Hinton.ImageNetclas-withdeepconvolutionalneuralnetworks.InNIPS,2012.', '1,3,4,7[26]Y.LeCun,B.Boser,J.Denker,D.Henderson,R.Howard,W.Hubbard,andL.Jackel.Backpropagationappliedtohandwrittenzipcoderecognition.NeuralComp.,1989.', '1[27]Y.LeCun,L.Bottou,Y.Bengio,andP.Haffner.Gradient-basedlearningappliedtodocumentrecognition.Proc.oftheIEEE,1998.', '1[28]J.J.Lim,C.L.Zitnick,andP.Doll´ar.Sketchtokens:Alearnedmid-levelrepresentationforcontourandobjectde-tection.InCVPR,2013.', 
'classAPclassAPclassAPclassAPclassAPaccordion50.8centipede30.4hairspray13.8pencilbox11.4snowplow69.2airplane50.0chainsaw14.1hamburger34.2pencilsharpener9.0soapdispenser16.8ant31.8chair19.5hammer9.9perfume32.8soccerball43.7antelope53.8chime24.6hamster46.0person41.7sofa16.3apple30.9cocktailshaker46.2harmonica12.6piano20.5spatula6.8armadillo54.0coffeemaker21.5harp50.4pineapple22.6squirrel31.3artichoke45.0computerkeyboard39.6hatwithawidebrim40.5ping-pongball21.045.1axe11.8computermouse21.2headcabbage17.4pitcher19.2stethoscope18.3babybed42.0corkscrew24.2helmet33.4pizza43.7stove8.1backpack2.8cream29.9hippopotamus38.0plasticbag6.4strainer9.9bagel37.5croquetball30.0horizontalbar7.0platerack15.2strawberry26.8balancebeam32.6crutch23.7horse41.7pomegranate32.0stretcher13.2banana21.9cucumber22.8hotdog28.7popsicle21.2sunglasses18.8bandaid17.4cupormug34.0iPod59.2porcupine37.2swimmingtrunks9.1banjo55.3diaper10.1isopod19.5powerdrill7.9swine45.3baseball41.8digitalclock18.523.7pretzel24.8syringe5.7basketball65.3dishwasher19.9koalabear44.3printer21.3table21.7bathingcap37.2dog76.8ladle3.0puck14.1tapeplayer21.4beaker11.3domesticcat44.1ladybug58.4punchingbag29.4tennisball59.1bear62.727.8lamp9.1purse8.0tick42.6bee52.9drum19.9laptop35.4rabbit71.0tie24.6bellpepper38.8dumbbell14.1lemon33.3racket16.2tiger61.8bench12.7electricfan35.0lion51.3ray41.1toaster29.2bicycle41.1elephant56.4lipstick23.1redpanda61.1traflight24.7binder6.2facepowder22.1lizard38.9refrigerator14.0train60.8bird70.944.5lobster32.4remotecontrol41.6trombone13.8bookshelf19.3cabinet20.6maillot31.0rubbereraser2.5trumpet14.4bowtie38.8werpot20.2maraca30.1rugbyball34.5turtle59.1bow9.04.9microphone4.0ruler11.5tvormonitor41.7bowl26.7fox59.3microwave40.1saltorpeppershaker24.6unicycle27.2brassiere31.2frenchhorn24.2milkcan33.3saxophone40.8vacuum19.5burrito25.7frog64.1miniskirt14.9scorpion57.3violin13.7bus57.5fryingpan21.5monkey49.6screwdriver10.6volleyball59.7b88.5giantpanda42.5motorcycle42.2seal20.9wafiron24.0camel37.628.6mushroom31.8sheep
48.9washer39.8canopener28.9golfball51.3nail4.5ski9.0waterbottle8.1car44.5golfcart47.9neckbrace31.6skunk57.9watercraft40.9cart48.0guacamole32.3oboe27.5snail36.2whale48.6cattle32.3guitar33.1orange38.8snake33.8winebottle31.2cello28.9hairdryer13.0otter22.2snowmobile58.8zebra49.6Table8:Per-classaverageprecision(%)ontheILSVRC2013detectiontestset.', '[29]D.Lowe.Distinctiveimagefeaturesfromscale-invariantkeypoints.IJCV,2004.', '1[30]A.OlivaandA.Torralba.Modelingtheshapeofthescene:Aholisticrepresentationofthespatialenvelope.IJCV,2001.', 'objectdetection.InCVPR,2013.', '6,7[32]H.A.Rowley,S.Baluja,andT.Kanade.Neuralnetwork-basedfacedetection.TPAMI,1998.', '2[33]D.E.Rumelhart,G.E.Hinton,andR.J.Williams.Learn-inginternalrepresentationsbyerrorpropagation.ParallelDistributedProcessing,1:318Œ362,1986.', '1[34]P.Sermanet,D.Eigen,X.Zhang,M.Mathieu,R.Fergus,andY.LeCun.OverFeat:IntegratedRecognition,Localiza-tionandDetectionusingConvolutionalNetworks.InICLR,2014.', '1,2,4,10[35]P.Sermanet,K.Kavukcuoglu,S.Chintala,andY.LeCun.Pedestriandetectionwithunsupervisedmulti-stagefeaturelearning.InCVPR,2013.', '2[36]H.Su,J.Deng,andL.Fei-Fei.Crowdsourcingannotationsforvisualobjectdetection.InAAAITechnicalReport,4thHumanComputationWorkshop,2012.', '8[37]K.SungandT.Poggio.Example-basedlearningforview-basedhumanfacedetection.TechnicalReportA.I.MemoNo.1521,MassachussetsInstituteofTechnology,1994.4[38]C.Szegedy,A.Toshev,andD.Erhan.Deepneuralnetworksforobjectdetection.InNIPS,2013.', '2[39]J.Uijlings,K.vandeSande,T.Gevers,andA.Smeulders.Selectivesearchforobjectrecognition.IJCV,2013.', '1,2,3,4,5,9[40]R.Vaillant,C.Monrocq,andY.LeCun.Originalapproachforthelocalisationofobjectsinimages.IEEProconVision,Image,andSignalProcessing,1994.', '2[41]X.Wang,M.Yang,S.Zhu,andY.Lin.Regionletsforgenericobjectdetection.InICCV,2013.', '3,5[42]M.Zeiler,G.Taylor,andR.Fergus.Adaptivedeconvolu-tionalnetworksformidandhighlevelfeaturelearning.InCVPR,2011.', 
'Figure12:Weshowthe24regionproposals,outoftheapproximately10millionregionsinVOC2007test,thatmoststrongly']
[]
[]
[]
['Bengio,Y.andLeCun,Y.Scalinglearningalgorithmsto-wardsAI.InLarge-ScaleKernelMachines,2007.', 'Bengio,Y.,Lamblin,P.,Popovici,D.,andLarochelle,H.Greedylayerwisetrainingofdeepnetworks.InNIPS,2007.', 'Berkes,P.andWiskott,L.Slowfeatureanalysisyieldsarichrepertoireofcomplexcellproperties.JournalofVision,2005.', 'Buildinghigh-levelfeaturesusinglarge-scaleunsupervisedlearningSchmidhuber,J.Deepbigsimpleneuralnetsexcelonhandwrittendigitrecognition.CoRR,2010.', 'Coates,A.,Lee,H.,andNg,A.Y.Ananalysisofsingle-layernetworksinunsupervisedfeaturelearning.InAIS-TATS14,2011.', 'Deng,J.,Dong,W.,Socher,R.,Li,L.-J.,Li,K.,andFei-Fei,L.ImageNet:ALarge-ScaleHierarchicalImageDatabase.InCVPR,2009.', 'Deng,J.,Berg,A.,Li,K.,andFei-Fei,L.Whatdoesclassifyingmorethan10,000imagecategoriestellus?InECCV,2010.', 'Desimone,R.,Albright,T.,Gross,C.,andBruce,C.Stimulus-selectivepropertiesofinferiortemporalneu-ronsinthemacaque.TheJournalofNeuroscience,1984.', 'DiCarlo,J.J.,Zoccolan,D.,andRust,N.C.Howdoesthebrainsolvevisualobjectrecognition?Neuron,2012.', 'Erhan,D.,Bengio,Y.,Courville,A.,andVincent,P.Visu-alizinghigher-layerfeaturesofdeepnetworks.Technicalreport,UniversityofMontreal,2009.', 'Fukushima,K.andMiyake,S.Neocognitron:Anewal-gorithmforpatternrecognitiontolerantofdeformationsandshiftsinposition.PatternRecognition,1982.', 'Gregor,K.andLeCun,Y.Emergenceofcomplex-likecellsinatemporalproductnetworkwithlocalreceptivearXiv:1006.0448,2010.', 'Hinton,G.E.andSalakhutdinov,R.R.Reducingthedi-mensionalityofdatawithneuralnetworks.Science,2006.', 'Hinton,G.E.,Osindero,S.,andTeh,Y.W.Afastlearn-ingalgorithmfordeepbeliefnets.NeuralComputation,2006.', 'Huang,G.B.,Ramesh,M.,Berg,T.,andLearned-Miller,E.Labeledfacesinthewild:Adatabaseforstudyingfacerecognitioninunconstrainedenvironments.Techni-calReport07-49,UniversityofMassachusetts,Amherst,October2007.', "Hubel,D.H.andWiesel,T.N.Receptiveofsingleneuronsinthethecat'svisualcortex.JournalofPhys-iology,1959.", 
'arinen,A.,Hurri,J.,andHoyer,P.O.NaturalImageStatistics.Springer,2009.', 'Jarrett,K.,Kavukcuoglu,K.,Ranzato,M.A.,andLeCun,Y.Whatisthebestmulti-stagearchitectureforobjectrecognition?InICCV,2009.', 'Keller,C.,Enzweiler,M.,andGavrila,D.M.Anewbench-markforstereo-basedpedestriandetection.InProc.oftheIEEEIntelligentVehiclesSymposium,2009.', 'Krizhevsky,A.Learningmultiplelayersoffeaturesfromtinyimages.Technicalreport,UniversityofToronto,2009.', 'Le,Q.V.,Ngiam,J.,Chen,Z.,Chia,D.,Koh,P.W.,andNg,A.Y.Tiledconvolutionalneuralnetworks.InNIPS,2010.', 'Le,Q.V.,Karpenko,A.,Ngiam,J.,andNg,A.Y.ICAwithReconstructionCostfortOvercompleteFeatureLearning.InNIPS,2011a.', 'Le,Q.V.,Ngiam,J.,Coates,A.,Lahiri,A.,Prochnow,B.,andNg,A.Y.Onoptimizationmethodsfordeeplearning.InICML,2011b.', 'LeCun,Y.,Bottou,L.,Bengio,Y.,andP.Gra-dientbasedlearningappliedtodocumentrecognition.ProceedingoftheIEEE,1998.', 'Lee,H.,Battle,A.,Raina,R.,andNg,AndrewY.tsparsecodingalgorithms.InNIPS,2007.', 'Lee,H.,Ekanadham,C.,andNg,A.Y.SparsedeepbeliefnetmodelforvisualareaV2.InNIPS,2008.', 'Lee,H.,Grosse,R.,Ranganath,R.,andNg,A.Y.Convo-lutionaldeepbeliefnetworksforscalableunsupervisedlearningofhierarchicalrepresentations.InICML,2009.', 'Lyu,S.andSimoncelli,E.P.Nonlinearimagerepresenta-tionusingdivisivenormalization.InCVPR,2008.', 'Olshausen,B.andField,D.Emergenceofsimple-cellre-ceptivepropertiesbylearningasparsecodefornat-uralimages.Nature,1996.', 'Pakkenberg,B.,P.,D.,Marner,L.,Bundgaard,M.J.,Gundersen,H.J.G.,Nyengaard,J.R.,andRegeur,L.Agingandthehumanneocortex.ExperimentalGeron-tology,2003.', 'Pinto,N.,Cox,D.D.,andDiCarlo,J.J.Whyisreal-worldvisualobjectrecognitionhard?PLoSComputationalBiology,2008.', 'Quiroga,R.Q.,Reddy,L.,Kreiman,G.,Koch,C.,andFried,I.Invariantvisualrepresentationbysingleneu-ronsinthehumanbrain.Nature,2005.', 'Raina,R.,Battle,A.,Lee,H.,Packer,B.,andNg,A.Y.Self-taughtlearning:Transferlearningfromunlabelleddata.InICML,2007.', 
'Raina,R.,Madhavan,A.,andNg,A.Y.Large-scaledeepunsupervisedlearningusinggraphicsprocessors.InICML,2009.', 'Ranzato,M.,Huang,F.J,Boureau,Y.,andLeCun,Y.Un-supervisedlearningofinvariantfeaturehierarchieswithapplicationstoobjectrecognition.InCVPR,2007.', 'Riesenhuber,M.andPoggio,T.Hierarchicalmodelsofobjectrecognitionincortex.NatureNeuroscience,1999.', 'Sanchez,J.andPerronnin,F.High-dimensionalsigna-turecompressionforlarge-scaleInCVPR,2011.', 'Sermanet,P.andLeCun,Y.Tsignrecognitionwithmultiscaleconvolutionalneuralnetworks.InIJCNN,2011.', 'Weston,J.,Bengio,S.,andUsunier,N.Wsabie:Scalinguptolargevocabularyimageannotation.InIJCAI,2011.', 'Zhang,W.,Sun,J.,andTang,X.Catheaddetection-howtoelyexploitshapeandtexturefeatures.InECCV,2008.', 'Buildinghigh-levelfeaturesusinglarge-scaleunsupervisedlearningA.TrainingandtestimagesAsubsetoftrainingimagesisshowninFigure7.Ascanbeseen,thepositions,scales,orientationsoffacesinthedatasetarediverse.AsubsetoftestimagesforFigure7.Thirtyrandomly-selectedtrainingimages(shownbeforethewhiteningstep).identifyingthefaceneuronisshowninFigure8.Figure8.Someexampletestsetimages(shownbeforethewhiteningstep).B.ModelsCentraltoourapproachinthispaperistheuseoflocally-connectednetworks.Inthesenetworks,neu-ronsonlyconnecttoalocalregionofthelayerbelow.InFigure9,weshowtheconnectivitypatternsoftheneuralnetworkarchitecturedescribedinthepaper.Theactualimagesintheexperimentsare2D,butforsimplicity,ourimagesinthevisualizationarein1D.Figure9.Diagramofthenetworkweusedwithmorede-tailedconnectivitypatterns.Colorarrowsmeanthatweightsonlyconnecttoonlyonemap.Darkarrowsmeanthatweightsconnecttoallmaps.PoolingneuronsonlyconnecttoonemapwhereassimpleneuronsandLCNneu-ronsconnecttoallmaps.C.ModelParallelismWeusemodelparallelismtodistributethestorageofparametersandgradientcomputationstotma-chines.InFigure10,weshowhowtheweightsaredividedandstoredint\\partitions,"ormoresimply,machines(seealso(Krizhevsky,2009', 
"Buildinghigh-levelfeaturesusinglarge-scaleunsupervisedlearningFigure13.Histogramsofneuron'sactivationvaluesforthebestfaceneurononthetestset.Red:thehistogramforfaceimages.Blue:thehistogramforrandomdistractors.Figure14.Histogramsforthebesthumanbodyneurononthetestset.Red:thehistogramforhumanbodyimages.Blue:thehistogramforrandomdistractors.I.MostresponsivestimuliforcatsandhumanbodiesInFigure16,weshowthemostresponsivestimuliforcatandhumanbodyneuronsonthetestsets.Notethat,thetopstimuliforthehumanbodyneuronareblackandwhiteimagesbecausethetestsetimagesareblackandwhite(Kelleretal.,2009"]
[]
['[1]wprobleminstancesinvision.Fromhttp://vision.csd.uwo.ca/data/maxflow/.[2]K.Asanovicandetal.Thelandscapeofparallelcomputingresearch:Aviewfromberkeley.TechnicalReportUCB/EECS-2006-183,ElectricalEngineeringandComputerSciences,UniversityofCalifornia', 'atBerkeley,2006.', '[3]D.P.Bertsekas.NonlinearProgramming.AthenaScienBelmont,MA,2ndedition,1999.', '[4]D.P.BertsekasandJ.N.Tsitsiklis.ParallelandDistributedComputation:NumericalMethods.AthenaScienBelmont,MA,1997.', '[5]L.BottouandO.Bousquet.Theoflargescalelearning.InAdvancesinNeuralInformationProcessingSystems,2008.', '[6]Y.BoykovandV.Kolmogorov.Anexperimentalcomparisonofwalgorithmsforenergyminimizationinvision.IEEETransactionsonPatternAnalysisandMachineIntelligence,26(9):1124{1137,2004.[7]G.alinescu,H.andY.Rabani.Animprovedapproximationalgorithmformultiwaycut.InProceedingsofthethirtiethannualACMSymposiumonTheoryofComputing,pages48{52,1998.', '[8]E.CandesandB.Recht.Exactmatrixcompletionviaconvexoptimization.FoundationsofComputa-tionalMathematics,9(6):717{772,2009.', '[9]J.DeanandS.Ghemawat.MapReduce:dataprocessingonlargeclusters.CommunicationsoftheACM,51(1):107{113,2008.', '[10]O.Dekel,R.Gilad-Bachrach,O.Shamir,andL.Xiao.Optimaldistributedonlinepredictionusingmini-batches.Technicalreport,MicrosoftResearch,2011.', '[11]A.Doan.http://dblife.cs.wisc.edu.[12]J.Duchi,A.Agarwal,andM.J.Wainwright.Distributeddualaveraginginnetworks.InAdvancesinNeuralInformationProcessingSystems,2010.', '[13]S.H.FullerandL.I.Millett,editors.TheFutureofComputingPerformance:GameOverorNextLevel.CommitteeonSustainingGrowthinComputingPerformance.TheNationalAcademiesPress,Washington,D.C.,2011.', '[14]T.Joachims.Traininglinearsvmsinlineartime.InProceedingsoftheACMConferenceonKnowledgeDiscoveryandDataMining(KDD),2006.', '[15]J.Langford.https://github.com/JohnLangford/vowpal_wabbit/wiki.[16]J.Langford,A.J.Smola,andM.Zinkevich.Slowlearnersarefast.InAdvancesinNeuralInformationProcessingSystems,2009.', 
'[17]J.Lee,,B.Recht,N.Srebro,R.R.Salakhutdinov,andJ.A.Tropp.Practicallarge-scaleoptimizationformax-normregularization.InAdvancesinNeuralInformationProcessingSystems,2010.', '[18]T.Lee,Z.Wang,H.Wang,andS.Hwang.Webscaleentityresolutionusingrelationalevidence.Technicalreport,MicrosoftResearch,2011.Availableat', '[19]D.Lewis,Y.Yang,T.Rose,andF.Li.RCV1:Anewbenchmarkcollectionfortextcategorizationresearch.JournalofMachineLearningResearch,5:361{397,2004.', '[20]Z.Q.LuoandP.Tseng.Analysisofanapproximategradientprojectionmethodwithapplicationstothebackpropagationalgorithm.OptimizationMethodsandSoftware,4:85{101,1994.', '[21]S.Melnik,A.Gubarev,J.J.Long,G.Romer,S.Shivakumar,M.Tolton,andT.Vassilakis.Dremel:Interactiveanalysisofweb-scaledatasets.InProceedingsofVLDB,2010.', '[22]A.NedicandD.P.Bertsekas.Convergencerateofincrementalsubgradientalgorithms.InS.UryasevandP.M.Pardalos,editors,StochasticOptimization:AlgorithmsandApplications,pages263{304.KluwerAcademicPublishers,2000.', '[23]A.Nemirovski,A.Juditsky,G.Lan,andA.Shapiro.Robuststochasticapproximationapproachtostochasticprogramming.SIAMJournalonOptimization,19(4):1574{1609,2009.[24]B.Recht,M.Fazel,andP.Parrilo.Guaranteedminimumranksolutionsofmatrixequationsvianuclearnormminimization.SIAMReview,52(3):471{501,2010.', '[25]B.RechtandC.Re.Parallelstochasticgradientalgorithmsforlarge-scalematrixcompletion.Submittedforpublication.Preprintavailableathttp://pages.cs.wisc.edu/~brecht/publications.html,2011.', '[26]S.Shalev-ShwartzandN.Srebro.SVMOptimization:Inversedependenceontrainingsetsize.InProceedingsofthe25thInternationConferenceonMachineLearning,2008.', '[27]N.Srebro,J.Rennie,andT.Jaakkola.Maximummarginmatrixfactorization.InAdvancesinNeuralInformationProcessingSystems,2004.', '[28]P.Tseng.Anincrementalgradient(-projection)methodwithmomentumtermandadaptivestepsizerule.SIAMJouralonOptimization,8(2):506{531,1998.', 
'[29]J.Tsitsiklis,D.P.Bertsekas,andM.Athans.Distributedasynchronousdeterministicandstochasticgradientoptimizationalgorithms.IEEETransactionsonAutomaticControl,31(9):803{812,1986.', '[30]M.Zinkevich,M.Weimer,A.Smola,andL.Li.Parallelizedstochasticgradientdescent.AdvancesinNeuralInformationProcessingSystems,2010.']
[]
[]
['[1]A.AgarwalandJ.C.Duchi.Distributeddelayedstochasticoptimization.InDecisionandControl(CDC),2012IEEE51stAnnualConferenceon', ',pages5451Œ5452.IEEE,2012.', '[2]J.Bergstra,O.Breuleux,F.Bastien,P.Lamblin,R.Pascanu,G.Desjardins,J.Turian,D.Warde-Farley,andY.Bengio.Theano:acpuandgpumathexpressioncompiler.InProceedingsofthePythonforComputingConference(SciPy),volume4,2010.', '[3]D.Ciresan,U.Meier,andJ.Schmidhuber.Multi-columndeepneuralnetworksforimageInComputerVisionandPatternRecognition(CVPR),2012IEEEConferenceon', ',pages3642Œ3649.IEEE,2012.', '[4]A.Coates,B.Huval,T.Wang,D.Wu,B.Catanzaro,andN.Andrew.Deeplearningwithcotshpcsystems.InProceedingsofthe30thInternationalConferenceonMachineLearning(ICML-13),pages1337Œ1345,2013.', '[5]R.Collobert,K.Kavukcuoglu,andC.Farabet.Torch7:Amatlab-likeenvironmentformachinelearning.InBigLearn,NIPSWorkshop,2011.', '[6]J.Dean,G.Corrado,R.Monga,K.Chen,M.Devin,Q.Le,M.Mao,M.Ranzato,A.Senior,P.Tucker,K.Yang,andA.Ng.Largescaledistributeddeepnetworks.InP.Bartlett,F.Pereira,C.Burges,L.Bottou,andK.Weinberger,editors,AdvancesinNeuralInformationProcessingSystems25,pages1232Œ1240.2012.', '[7]J.Deng,W.Dong,R.Socher,L.-J.Li,K.Li,andL.Fei-Fei.Imagenet:Alarge-scalehierarchicalimagedatabase.InComputerVisionandPatternRecognition,2009.CVPR2009.IEEEConferenceon', ',pages248Œ255.IEEE,2009.', '[8]J.Donahue,Y.Jia,O.Vinyals,J.Hoffman,N.Zhang,E.Tzeng,andT.Darrell.Decaf:Adeepconvolu-tionalactivationfeatureforgenericvisualrecognition.arXivpreprintarXiv:1310.1531,2013.', '[9]J.Duchi,E.Hazan,andY.Singer.Adaptivesubgradientmethodsforonlinelearningandstochasticoptimization.TheJournalofMachineLearningResearch,999999:2121Œ2159,2011.[10]R.Girshick,J.Donahue,T.Darrell,andJ.Malik.Richfeaturehierarchiesforaccurateobjectdetectionandsemanticsegmentation.arXivpreprintarXiv:1311.2524,2013.', '[11]G.E.Hinton,N.Srivastava,A.Krizhevsky,I.Sutskever,andR.R.Salakhutdinov.Improvingneuralnetworksbypreventingco-adaptationoffeaturedetectors.arXivpreprintarXiv:1207.0580,2012.', 
'[12]A.Krizhevsky,I.Sutskever,andG.Hinton.Imagenetwithdeepconvolutionalneuralnet-works.InAdvancesinNeuralInformationProcessingSystems25,pages1106Œ1114,2012.[13]Q.V.Le,M.Ranzato,R.Monga,M.Devin,K.Chen,G.S.Corrado,J.Dean,andA.Y.Ng.Buildinghigh-levelfeaturesusinglargescaleunsupervisedlearning.arXivpreprintarXiv:1112.6209,2011.', '[14]V.NairandG.E.Hinton.linearunitsimproverestrictedboltzmannmachines.InProceedingsofthe27thInternationalConferenceonMachineLearning(ICML-10),pages807Œ814,2010.', '[15]F.Niu,B.Recht,C.R´e,andS.J.Wright.Hogwild!:Alock-freeapproachtoparallelizingstochasticgradientdescent.arXivpreprintarXiv:1106.5730,2011.', '[16]R.Raina,A.Madhavan,andA.Y.Ng.Large-scaledeepunsupervisedlearningusinggraphicsprocessors.InICML,volume9,pages873Œ880,2009.', '[17]M.D.Zeiler.Adadelta:Anadaptivelearningratemethod.arXivpreprintarXiv:1212.5701,2012.', '[18]M.D.ZeilerandR.Fergus.VisualizingandUnderstandingConvolutionalNetworks.ArXive-prints,Nov.2013.']
[]
['[1]AlexKrizhevsky,IlyaSutskever,andGeoffreyHinton.ImageNetwithdeepconvolutionalneuralnetworks.InNIPS,2012.', '[2]J.Deng,W.Dong,R.Socher,L.J.Li,K.Li,andL.Fei-Fei.Imagenet:alarge-scalehierarchicalimagedatabase.InCVPR,2009.', 'Figure2:Diagramofagenericdeepnetwork.Thenumberofarrowsisproportionaltothesizeofthemini-batch.Figure3:DiagramofagenericdeepnetworkusingtwoGPUs(dataparallelism).EachGPUcomputeserrorsandgradientsforhalfofthesamplesinthemini-batch.ParametersandgradientsarecommunicatedacrossGPUsusingPCI-e.ThelayerscomputedonaGPUallsharethesamecolorinthediagram.[3]Y.LeCun,L.Bottou,Y.Bengio,andP.Haffner.Gradient-basedlearningappliedtodocumentrecognition.ProceedingsoftheIEEE,86(11):2278Œ2324,1998.[4]C.Szegedy,A.Toshev,andD.Erhan.Deepneuralnetworksforobjectdetection.InNIPS,2013.', '[5]C.Farabet,L.Couprie,C.amdNajman,andY.LeCun.Learninghierarchicalfeaturesforscenelabeling.IEEETransactionsonPatternAnalysisandMachineIntelligence,2013.', '[6]O.Abdel-Hamid,A.R.Mohamed,H.Jiang,andG.Penn.Applyingconvolutionalneuralnetworkscon-ceptstohybridnn-hmmmodelforspeechrecognition.InICASSP,2012.', '[7]T.Sainath,A.R.Kingsbury,Mohamed,G.Dahl,G.Saon,H.Soltau,T.Beran,A.Aravkin,andB.Ram-abhadran.Improvementstodeepconvolutionalneuralnetworksforlvcsr.InASRU,2013.', '[8]R.Collobert,J.Weston,L.Bottou,M.Karlen,K.Kavukcuoglu,andP.Kuksa.Naturallanguageprocess-ing(almost)fromscratch.JournalofMachineLearningResearch,12:2493Œ2537,2011.[9]J.Dean,G.Corrado,R.Monga,K.Chen,M.Devin,Q.Le,M.Mao,M.Ranzato,A.Senior,P.Tucker,K.Yang,andA.Ng.Largescaledistributeddeepnetworks.InNIPS,2012.', '[10]S.Zhang,C.Zhang,Z.You,R.Zheng,andB.Xu.Asynchronousstochasticgradientdescentfordnntraining.InICASSP,2013.', '[11]X.Chen,A.Eversole,G.Li,D.Yu,andF.Seide.Pipelinedback-propagationforcontext-dependentdeepneuralnetworks.InInterspeech,2012.', '[12]A.Coates,B.Huval,T.Wang,D.J.Wu,A.Y.Ng,andB.Catanzaro.Deeplearningwithcotshpc.InICML,2013.']
['AdamCoates,BrodyHuval,TaoWang,DavidWu,BryanCatanzaro,andNgAndrew.Deeplearn-ingwithcotshpcsystems.InProceedingsofThe30thInternationalConferenceonMachineLearn-ing,pages45,2013.', "Je˙reyDean,GregCorrado,RajatMonga,KaiChen,MatthieuDevin,QuocVLe,MarkZMao,Marc'AurelioRanzato,AndrewWSenior,PaulATucker,etal.Largescaledistributeddeepnet-works.InNIPS,pages12322012.JiaDeng,WeiDong,RichardSocher,Li-JiaLi,KaiLi,andLiFei-Fei.Imagenet:Alarge-scalehi-erarchicalimagedatabase.InComputerVisionandPatternRecognition,2009.CVPR2009.IEEE", 'Conferenceon,pages55.IEEE,2009.', 'AlexKrizhevsky,IlyaSutskever,andGeo˙reyEHin-ton.Imagenetclassi˝cationwithdeepconvolu-tionalneuralnetworks.InNIPS,volume1,page4,2012.', 'FengNiu,BenjaminRecht,ChristopherRé,andStephenJWright.Hogwild!:Alock-freeapproachtoparallelizingstochasticgradientdescent.Ad-vancesinNeuralInformationProcessingSystems,2011.', 'ThomasPaine,HailinJin,JianchaoYang,ZheLin,andThomasHuang.Gpuasynchronousstochasticgradientdescenttospeedupneuralnetworktrain-ing.arXivpreprintarXiv:1312.6186,2013.', "OmryYadan,KeithAdams,YanivTaigman,andMarc'AurelioRanzato.Multi-gputrainingofcon-vnets.arXivpreprintarXiv:1312.5853,2013."]
[]
[]
['1207.4708', '1409.1556', '1409.4842']
[]
['Bellemare,M.,Veness,J.,&Bowling,M.(2012).Investigatingcontingencyawarenessusing', 'Atari2600games.InProceedingsofthethe26thConferenceonAIntelligence(AAAI).Browne,C.B.,Powley,E.,Whitehouse,D.,Lucas,S.M.,Cowling,P.I.,Rohlfshagen,P.,Tavener,S.,Perez,D.,Samothrakis,S.,&Colton,S.(2012).AsurveyofMonteCarlo', 'treesearchmethods.IEEETransactionsonComputationalIntelligenceandAIinGames,4(1),1{43.Cobo,L.C.,Zang,P.,Isbell,C.L.,&Thomaz,A.L.(2011).Automaticstateabstraction', 'fromdemonstration.InProceedingsofthe22ndSecondInternationalJointConferenceonArticialIntelligence(IJCAI).Coles,A.,Coles,A.,Olaya,A.,Jimenez,S.,opez,C.,Sanner,S.,&Yoon,S.(2012).A', 'surveyoftheseventhinternationalplanningcompetition.AIMagazine,33(1),83{88.Diuk,C.,Cohen,A.,&Littman,M.L.(2008).Anobject-orientedrepresentationfor', 'cientreinforcementlearning.InProceedingsofthe25thInternationalConferenceonMachinelearning(ICML).Dowe,D.L.,&Hajek,A.R.(1998).Anon-behavioural,computationalextensiontothe', 'TuringTest.InProceedingsoftheInternationalConferenceonComputationalIntel-ligenceandMultimediaApplications(ICCIMA).Genesereth,M.R.,Love,N.,&Pell,B.(2005).GeneralGamePlaying:Overviewofthe', 'AAAIcompetition.AIMagazine,26(2),62{72.Gionis,A.,Indyk,P.,&Motwani,R.(1999).Similaritysearchinhighdimensionsvia', 'hashing.InProceedingsoftheInternationalConferenceonVeryLargeDatabases.Hausknecht,M.,Khandelwal,P.,Miikkulainen,R.,&Stone,P.(2012).HyperNEAT-GGP:', 'AHyperNEAT-basedAtarigeneralgameplayer.InProceedingsoftheGeneticandEvolutionaryComputationConference(GECCO).andez-Orallo,J.,&Dowe,D.L.(2010).Measuringuniversalintelligence:Towardsan', 'anytimeintelligencetest.AIntelligence,174(18),1508{1539.andez-Orallo,J.,&Minaya-Collado,N.(1998).Aformalofintelligence', 'basedonanintensionalvariantofKolmogorovcomplexity.InProceedingsoftheInternationalSymposiumofEngineeringofIntelligentSystems(EIS).Hutter,M.(2005).', 'UniversalAIntelligence:SequentialDecisionsbasedonAlgorith-micProbability.Springer,Berlin.Kanerva,P.(1988).', 
'SparseDistributedMemory.TheMITPress.Kocsis,L.,&Szepari,C.(2006).BanditbasedMonte-Carloplanning.In', 'Proceedingsofthe15thEuropeanConferenceonMachineLearning(ECML).Legg,S.(2008).', 'MachineSuperIntelligence.Ph.D.thesis,UniversityofLugano.Legg,S.,&Veness,J.(2011).Anapproximationoftheuniversalintelligencemeasure.In', 'TheArcadeLearningEnvironment:AnEvaluationPlatformforGeneralAgentsMohan,S.,&Laird,J.E.(2009).LearningtoplayMario.Tech.rep.CCA-TR-2009-03,', 'CenterforCognitiveArchitecture,UniversityofMichigan.Monroy,G.A.,Stanley,K.O.,&Miikkulainen,R.(2006).Coevolutionofneuralnetworks', 'usingalayeredparetoarchive.InProceedingsofthe8thGeneticandEvolutionaryComputationConference(GECCO).Montfort,N.,&Bogost,I.(2009).', 'RacingtheBeam:TheAtariVideoComputerSystem.MITPress.Naddaf,Y.(2010).Game-IndependentAIAgentsforPlayingAtari2600ConsoleGames.', "Master'sthesis,UniversityofAlberta.Pell,B.(1993).", 'StrategyGenerationandEvaluationforMeta-GamePlaying.Ph.D.thesis,UniversityofCambridge.Pierce,D.,&Kuipers,B.(1997).Maplearningwithuninterpretedsensorsandtors.', 'AIntelligence,92(1-2),169{227.Russell,S.J.(1997).Rationalityandintelligence.', 'Aintelligence,94(1),57{77.Schaul,T.,Togelius,J.,&Schmidhuber,J.(2011).Measuringintelligencethroughgames.', 'CoRR,abs/1109.1314.Schweitzer,P.J.,&Seidmann,A.(1985).GeneralizedpolynomialapproximationsinMarko-', 'viandecisionprocesses.Journalofmathematicalanalysisandapplications,110(2),568{582.Stober,J.,&Kuipers,B.(2008).Frompixelstopolicies:Abootstrappingagent.In', 'Proceedingsofthe7thIEEEInternationalConferenceonDevelopmentandLearning(ICDL).Sutton,R.S.,&Barto,A.G.(1998).', 'ReinforcementLearning:AnIntroduction.TheMITPress.Sutton,R.,Modayil,J.,Delp,M.,Degris,T.,Pilarski,P.,White,A.,&Precup,D.(2011).', 'Horde:Ascalablereal-timearchitectureforlearningknowledgefromunsupervisedsensorimotorinteraction.InProceedingsofthe10thInternationalConferenceonAu-tonomousAgentsandMultiagentsSystems(AAMAS).Thrun,S.,&Mitchell,T.M.(1995).Lifelongrobotlearning.', 
'RoboticsandAutonomousSystems,15(1),25{46.Watkins,C.,&Dayan,P.(1992).Q-learning.', 'MachineLearning,8,279{292.Whiteson,S.,Tanner,B.,Taylor,M.E.,&Stone,P.(2011).Protectingagainstevaluation', 'ovinempiricalreinforcementlearning.InProceedingsoftheIEEESymposiumonAdaptiveDynamicProgrammingandReinforcementLearning(ADPRL).Whiteson,S.,Tanner,B.,&White,A.(2010).Thereinforcementlearningcompetitions.', 'AIMagazine,31(2),81{94.Wintermute,S.(2010).Usingimagerytosimplifyperceptualabstractioninreinforcement']
[]
[]
[]
['[1]Knowyourmeme:Weneedtogodeeper.http://knowyourmeme.com/memes/we-need-to-go-deeper.Accessed:2014-09-15.', '[2]SanjeevArora,AdityaBhaskara,RongGe,andTengyuMa.Provableboundsforlearningsomedeeprepresentations.CoRR,abs/1310.6343,2013.[3]¨UmitV.C¸ataly¨urek,CevdetAykanat,andBoraUc¸ar.Ontwo-dimensionalsparsematrixpar-titioning:Models,methods,andarecipe.SIAMJ.Sci.Comput.,32(2):656Œ683,February2010.', "[4]JeffreyDean,GregCorrado,RajatMonga,KaiChen,MatthieuDevin,MarkMao,Marc'aurelioRanzato,AndrewSenior,PaulTucker,KeYang,QuocV.Le,andAndrewY.Ng.Largescaledistributeddeepnetworks.InP.Bartlett,F.c.n.Pereira,C.j.c.Burges,L.Bot-tou,andK.q.Weinberger,editors,AdvancesinNeuralInformationProcessingSystems25,pages1232Œ1240.2012.[5]DumitruErhan,ChristianSzegedy,AlexanderToshev,andDragomirAnguelov.Scalableob-jectdetectionusingdeepneuralnetworks.InComputerVisionandPatternRecognition,2014.", 'CVPR2014.IEEEConferenceon', ',2014.', '[6]RossB.Girshick,JeffDonahue,TrevorDarrell,andJitendraMalik.Richfeaturehierarchiesforaccurateobjectdetectionandsemanticsegmentation.InComputerVisionandPatternRecognition,2014.CVPR2014.IEEEConferenceon', ',2014.', '[7]GeoffreyE.Hinton,NitishSrivastava,AlexKrizhevsky,IlyaSutskever,andRuslanSalakhut-dinov.Improvingneuralnetworksbypreventingco-adaptationoffeaturedetectors.CoRR,abs/1207.0580,2012.[8]AndrewG.Howard.SomeimprovementsondeepconvolutionalneuralnetworkbasedimageCoRR,abs/1312.5402,2013.[9]AlexKrizhevsky,IlyaSutskever,andGeoffHinton.Imagenetwithdeepcon-volutionalneuralnetworks.InAdvancesinNeuralInformationProcessingSystems25,pages1106Œ1114,2012.[10]Y.LeCun,B.Boser,J.S.Denker,D.Henderson,R.E.Howard,W.Hubbard,andL.D.Jackel.Backpropagationappliedtohandwrittenzipcoderecognition.NeuralComput.,1(4):541Œ551,December1989.', 
'[11]YannLeCun,L´eonBottou,YoshuaBengio,andPatrickHaffner.Gradient-basedlearningappliedtodocumentrecognition.ProceedingsoftheIEEE,86(11):2278Œ2324,1998.[12]MinLin,QiangChen,andShuichengYan.Networkinnetwork.CoRR,abs/1312.4400,2013.[13]B.T.PolyakandA.B.Juditsky.Accelerationofstochasticapproximationbyaveraging.SIAMJ.ControlOptim.,30(4):838Œ855,July1992.', '[15]ThomasSerre,LiorWolf,StanleyM.Bileschi,MaximilianRiesenhuber,andTomasoPoggio.Robustobjectrecognitionwithcortex-likemechanisms.IEEETrans.PatternAnal.Mach.Intell.,29(3):411Œ426,2007.', "[16]FengguangSongandJackDongarra.Scalingupmatrixcomputationsonshared-memorymanycoresystemswith1000cpucores.InProceedingsofthe28thACMInternationalCon-ferenceonSupercomputing,ICS'14,pages333Œ342,NewYork,NY,USA,2014.ACM.", '[17]IlyaSutskever,JamesMartens,GeorgeE.Dahl,andGeoffreyE.Hinton.Ontheimportanceofinitializationandmomentumindeeplearning.InProceedingsofthe30thInternationalConferenceonMachineLearning,ICML2013,Atlanta,GA,USA,16-21June2013', ',volume28ofJMLRProceedings,pages1139Œ1147.JMLR.org,2013.[18]ChristianSzegedy,AlexanderToshev,andDumitruErhan.Deepneuralnetworksforobjectdetection.InChristopherJ.C.Burges,L´eonBottou,ZoubinGhahramani,andKilianQ.Weinberger,editors,AdvancesinNeuralInformationProcessingSystems26:27thAnnualConferenceonNeuralInformationProcessingSystems2013.Proceedingsofameetingheld', 'December5-8,2013,LakeTahoe,Nevada,UnitedStates.', ',pages2553Œ2561,2013.[19]AlexanderToshevandChristianSzegedy.Deeppose:Humanposeestimationviadeepneuralnetworks.CoRR,abs/1312.4659,2013.[20]KoenE.A.vandeSande,JasperR.R.Uijlings,TheoGevers,andArnoldW.M.Smeulders.Segmentationasselectivesearchforobjectrecognition.InProceedingsofthe2011Interna-', "tionalConferenceonComputerVision,ICCV'11,pages1879Œ1886,Washington,DC,USA,2011.IEEEComputerSociety.", 
'[21]MatthewD.ZeilerandRobFergus.Visualizingandunderstandingconvolutionalnetworks.InDavidJ.Fleet,Tom´asPajdla,BerntSchiele,andTinneTuytelaars,editors,ComputerVision-ECCV2014-13thEuropeanConference,Zurich,Switzerland,September6-12,2014,Pro-', 'ceedings,PartI,volume8689ofLectureNotesinComputerScience,pages818Œ833.Springer,2014.']
['Bellemare,MarcG,Naddaf,Yavar,Veness,Joel,andBowling,Michael.Thearcadelearningenvironment:Anevaluationplatformforgeneralagents.arXivpreprintarXiv:1207.4708,2012.', 'Coates,Adam,Huval,Brody,Wang,Tao,Wu,David,Catanzaro,Bryan,andAndrew,Ng.Deeplearningwithcotshpcsystems.InProceedingsofThe30thInterna-tionalConferenceonMachineLearning,pp.1337Œ1345,2013.', 'Dahl,GeorgeE,Yu,Dong,Deng,Li,andAcero,Alex.Context-dependentpre-traineddeepneuralnetworksforlarge-vocabularyspeechrecognition.Audio,Speech,andLanguageProcessing,IEEETransactionson,20(1):30Œ42,2012.', 'Dean,Jeffrey,Corrado,Greg,Monga,Rajat,Chen,Kai,Devin,Matthieu,Mao,Mark,Senior,Andrew,Tucker,Paul,Yang,Ke,Le,QuocV,etal.Largescaledistributeddeepnetworks.InAdvancesinNeuralInformationPro-cessingSystems,pp.1223Œ1231,2012.Duchi,John,Hazan,Elad,andSinger,Yoram.Adaptivesubgradientmethodsforonlinelearningandstochasticoptimization.TheJournalofMachineLearningRe-search,12:2121Œ2159,2011.Graves,Alex,Mohamed,A-R,andHinton,Geoffrey.Speechrecognitionwithdeeprecurrentneuralnetworks.InAcoustics,SpeechandSignalProcessing(ICASSP),2013IEEEInternationalConferenceon', ',pp.6645Œ6649.IEEE,2013.', 'Grounds,MatthewandKudenko,Daniel.Parallelrein-forcementlearningwithlinearfunctionapproximation.InProceedingsofthe5th,6thand7thEuropeanConfer-enceonAdaptiveandLearningAgentsandMulti-agentSystems:AdaptationandMulti-agentLearning,pp.60Œ74.Springer-Verlag,2008.', 'Krizhevsky,Alex,Sutskever,Ilya,andHinton,Geoff.Im-agenetwithdeepconvolutionalneuralnet-works.InAdvancesinNeuralInformationProcessingSystems25,pp.1106Œ1114,2012.Lauer,MartinandRiedmiller,Martin.Analgorithmfordistributedreinforcementlearningincooperativemulti-agentsystems.InInProceedingsoftheSeventeenthIn-ternationalConferenceonMachineLearning,pp.535Œ542.MorganKaufmann,2000.', 'Li,YuxiandSchuurmans,Dale.Mapreduceforparallelre-inforcementlearning.InRecentAdvancesinReinforce-mentLearning-9thEuropeanWorkshop,EWRL2011,', 'Athens,Greece,September9-11,2011,RevisedSelected', 'Papers,pp.309Œ320,2011.', 
'Lin,Long-Ji.Reinforcementlearningforrobotsusingneu-ralnetworks.Technicalreport,DTICDocument,1993.', 'Mnih,Volodymyr,Kavukcuoglu,Koray,Silver,David,Graves,Alex,Antonoglou,Ioannis,Wierstra,Daan,andRiedmiller,Martin.Playingatariwithdeepreinforce-mentlearning.InNIPSDeepLearningWorkshop.2013.', 'Mnih,Volodymyr,Kavukcuoglu,Koray,Silver,David,Rusu,AndreiA.,Veness,Joel,Bellemare,MarcG.,Graves,Alex,Riedmiller,Martin,Fidjeland,AndreasK.,Ostrovski,Georg,Petersen,Stig,Beattie,Charles,Sadik,Amir,Antonoglou,Ioannis,King,Helen,Kumaran,Dharshan,Wierstra,Daan,Legg,Shane,andHassabis,Demis.Human-levelcontrolthroughdeepreinforcementlearning.Nature,518(7540):529Œ533,022015.URLhttp://dx.doi.org/10.1038/nature14236.Silver,David,Newnham,Leonard,Barker,David,Weller,Suzanne,andMcFall,Jason.Concurrentreinforcementlearningfromcustomerinteractions.InProceedingsofthe30thInternationalConferenceonMachineLearning,pp.924Œ932,2013.', 'Simonyan,KarenandZisserman,Andrew.Verydeepcon-volutionalnetworksforlarge-scaleimagerecognition.arXivpreprintarXiv:1409.1556,2014.', 'Sutton,R.andBarto,A.ReinforcementLearning:anIn-troduction.MITPress,1998.', 'Szegedy,Christian,Liu,Wei,Jia,Yangqing,Sermanet,Pierre,Reed,Scott,Anguelov,Dragomir,Erhan,Du-mitru,Vanhoucke,Vincent,andRabinovich,Andrew.Goingdeeperwithconvolutions.arXivpreprintarXiv:1409.4842,2014.', 'Tsitsiklis,J.andRoy,B.Van.Ananalysisoftemporal-differencelearningwithfunctionapproximation.IEEETransactionsonAutomaticControl,42(5):674Œ690,1997.', 'Weiss,Gerhard.Distributedreinforcementlearning.15:135Œ142,1995.', 'AppendixJuly17,2015', 'MassivelyParallelMethodsforDeepReinforcementLearningTable3.RAWDATA-HUMANSTARTSGamesRandomHumanDQNGorilaAvgAlien128.306371.30570.20813.54Amidar11.801540.40133.40189.15Assault166.90628.903332.301195.85Asterix164.507536.00124.503324.70Asteroids877.1036517.30697.10933.63Atlantis13463.0026575.0076108.00629166.50BankHeist21.70644.50176.30399.42BattleZone3560.0033030.0017560.0019938.00', 
'MassivelyParallelMethodsforDeepReinforcementLearningTable4.RAWDATA-NULLOPGamesRandomHumanDQNGorilaAvgAlien227.806875.403069.302620.53Amidar5.801675.80739.501189.70Assault222.401496.403358.601450.41Asterix210.008503.306011.706433.33Asteroids719.1013156.701629.301047.66Atlantis12850.0029028.1085950.00100069.16BankHeist14.20734.40429.70609.00BattleZone2360.0037800.0026300.0025266.66BeamRider363.905774.706845.903302.91Bowling23.10154.8042.4054.01Boxing0.104.3071.8094.88Breakout1.7031.80401.20402.20Centipede2090.9011963.208309.408432.30ChopperCommand811.009881.806686.704167.50CrazyClimber10780.5035410.50114103.3085919.16DemonAttack152.103401.309711.2013693.12DoubleDunk-18.60-15.50-18.10-10.62Enduro0.00309.60301.80114.90FishingDerby-91.705.50-0.8020.19Freeway0.0029.6030.3011.69Frostbite65.204334.70328.30605.16Gopher257.602321.008520.005279.00Gravitar173.002672.00306.701054.58Hero1027.0025762.5019950.30', '14913.87IceHockey-11.200.90-1.60-0.61JamesBond29.00406.70576.70605.00Kangaroo52.003035.006740.002549.16Krull1598.002394.603804.707882.00KungFuMaster258.5022736.2023270.0027543.33MontezumaRevenge0.004366.700.004.16MsPacman307.3015693.402311.003233.50NameThisGame2292.304076.207256.706182.16Pong-20.709.3018.9018.30PrivateEye24.9069571.301787.60748.60QBert163.9013455.0010595.8010815.55RiverRaid1338.5013513.308315.708344.83RoadRunner11.507845.0018256.7051007.99Robotank2.2011.9051.6036.43Seaquest68.4020181.805286.0013169.06SpaceInvaders148.001652.301975.50']
['1406.1231', '1409.4842', '1409.4842']
[]
['Ajay,W.PatrickWalters,andMarkA.Murcko.Canwelearntodistinguishbetweendrug-likeandnondrug-likemolecules?JournalofMedicinalChemistry,41(18):3314{3324,1998.doi:10.1021/jm970666c.URL', 'http://pubs.acs.org/doi/abs/10.1021/jm970666c.FrankR.Burden,MartynG.Ford,DavidC.Whitley,andDavidA.Winkler.Useofautomaticrelevancedeterminationinqsarstudiesusingbayesianneuralnetworks.JournalofChemicalInformationandComputerSciences,40(6):1423{1430,2000.doi:10.1021/ci000450a.URLhttp://pubs.acs.org/doi/abs/10.1021/ci000450a.FrankR.andBurden.Robustqsarmodelsusingbayesianregularizedneuralnetworks.JournalofMedicinalChemistry,42(16):3183{3187,1999.doi:10.1021/jm980697n.URL', 'http://pubs.acs.org/doi/abs/10.1021/jm980697n.2013.08.', 'MolecularOperatingEnvironment(MOE).ChemicalComputingGroupInc.,1010SherbookeSt.West,Suite910,Montreal,QC,Canada,H3A2R7,2013.', 'RonanCollobertandJasonWeston.AArchitectureforNaturalLanguagePro-cessing:DeepNeuralNetworkswithMultitaskLearning.InProceedingsofthe25thInternationalConferenceonMachineLearning(ICML-2008)', ',pages160{167,2008.', 'JamesDevillers.NeuralnetworksinQSARanddrugdesign.AcademicPress,1996.', "HongyingDu,JieWang,ZhideHu,XiaojunYao,andXiaoyunZhang.Predictionoffungicidalactivitiesofriceblastdiseasebasedonleast-squaressupportvectormachinesandprojectpursuitregression.JournalofAgriculturalandFoodChemistry,56(22):10785{10792,2008.doi:10.1021/jf8022194.URLhttp://pubs.acs.org/doi/abs/10.1021/jf8022194.DumitruErhan,Pierre-JeanL'Heureux,ShiYiYue,andYoshuaBengio.Collaborativeonafamilyofbiologicaltargets.J.Chem.Inf.Model.,46(2):626{635,2006.", 'URLhttp://dx.doi.org/10.1021/ci050367t.DumitruErhan,AaronCourville,YoshuaBengio,andPascalVincent.Whydoesunsu-pervisedpre-traininghelpdeeplearning?InProceedingsofAISTATS2010', ',volume9,pages201{208,May2010.', 
'E.Hinton,SimonOsindero,andY.W.Teh.Afastlearningalgorithmfordeepbeliefnets.NeuralComputation,18:1527{1554,2006.E.Hinton,LiDeng,DongYu,GeorgeE.Dahl,Abdel-rahmanMohamed,NavdeepJaitly,AndrewSenior,VincentVanhoucke,PatrickNguyen,TaraN.Sainath,andBrianKingsbury.Deepneuralnetworksforacousticmodelinginspeechrecognition:Thesharedviewsoffourresearchgroups.SignalProcessingMagazine,29(6):82{97,2012a.', 'E.Hinton,NitishSrivastava,AlexKrizhevsky,IlyaSutskever,andRuslanSalakhutdinov.Improvingneuralnetworksbypreventingco-adaptationoffeaturede-tectors.TheComputingResearchRepository(CoRR),abs/1207.0580,2012b.URLhttp://arxiv.org/abs/1207.0580.AlexKrizhevsky,IlyaSutskever,andE.Hinton.ImageNetwithdeepconvolutionalneuralnetworks.InPeterL.Bartlett,FernandoC.N.Pereira,Christo-pherJ.C.Burges,LonBottou,andKilianQ.Weinberger,editors,AdvancesinNeuralInformationProcessingSystems,pages1106{1114,2012.PeixunLiuandWeiLong.Currentmathematicalmethodsusedinqsar/qsprstudies.InternationalJournalofMolecularSciences,10(5):1978{1998,2009.ISSN1422-0067.', 'BioinformaticsandComputationalBiology(CIBCB),2012IEEESymposiumon', ',pages314{320,2012.', 'EricMartin,PrasenjitMukherjee,DavidSullivan,andJohannaJansen.anovelmeta-qsarmethodthatcombinesactivitiesacrossthekinasefamilytoaccuratelypredicty,selectivity,andcellularactivity.Journalofchemicalinformationandmodeling,51(8):1942{1956,2011.OlgaObrezanova,aboranyi,JoelleMRGola,andMatthewDSegall.Gaussianpro-cesses:amethodforautomaticqsarmodelingofadmeproperties.Journalofchemicalinformationandmodeling,47(5):1847{1857,2007.F.Pedregosa,G.Varoquaux,A.Gramfort,V.Michel,B.Thirion,O.Grisel,M.Blon-del,P.Prettenhofer,R.Weiss,V.Dubourg,J.Vanderplas,A.Passos,D.Cournapeau,M.Brucher,M.Perrot,andE.Duchesnay.Scikit-learn:MachinelearninginPython.JournalofMachineLearningResearch,12:2825{2830,2011.D.E.Rumelhart,G.E.Hinton,andR.J.Williams.Learningrepresentationsbyback-propagatingerrors.Nature,323(6088):533{536,1986.', 
'HongzongSi,TaoWang,KejunZhang,Yun-BoDuan,ShupingYuan,AipingFu,andZhideHu.Quantitativestructureactivityrelationshipmodelforpredictingthedepletionpercentageofskinallergicchemicalsubstancesofglutathione.AnalyticaChimicaActa,591(2):255{264,2007.', 'JasperSnoek,HugoLarochelle,andRyanPrescottAdams.Practicalbayesianoptimizationofmachinelearningalgorithms.InAdvancesinNeuralInformationProcessingSystems,pages2960{2968,2012.JasperSnoek,KevinSwersky,RichardZemel,andRyanPrescottAdams.Inputwarpingforbayesianoptimizationofnon-stationaryfunctions.InAdvancesinNeuralInformationProcessingSystemsWorkshoponBayesianOptimization,2013.', 'VladimirSvetnik,AndyLiaw,ChristopherTong,J.ChristopherCulberson,RobertP.Sheridan,andBradleyP.Feuston.Randomforest:Acandregressiontoolforcompoundandqsarmodeling.JournalofChemicalInformationandComputerSciences,43(6):1947{1958,2003.DavidAWinkler.Theroleofquantitativestructure-activityrelationships(qsar)inbiomoleculardiscovery.inbioinformatics,3(1):73{86,2002.', 'AStochasticgradientdescentdetailsForallneuralnettraining,weusedminibatchstochasticgradientdescentwithmomentumandbackpropagation[Rumelhartetal.,1986]tocomputethenecessarygradients.Let', 'Cbethetrainingobjectivefunctionandletwbeagenericneuralnetparamete[email protected]@[email protected]tchofcases,weusedthefollowingweightupdateformulas:v(t)=v(t1)[email protected]@w˛w(t)=w(t1)+v(t);whereisthelearningrateorstepsize,isthemomentumstrength,andistheweightcoststrength.BBayesianoptimizationsearchspaceWeusedtheconstrainedversionofSpearmint[Snoeketal.,2012]withwarpingenabled']
[]
[]
['Abdo,Ammar,Chen,Beining,Mueller,Christoph,Salim,Naomie,andWillett,Peter.Ligand-basedvirtualscreen-ingusingbayesiannetworks.Journalofchemicalinfor-mationandmodeling,50(6):1012Œ1020,2010.Collobert,RonanandWeston,Jason.Aarchitecturefornaturallanguageprocessing:Deepneuralnetworkswithmultitasklearning.InProceedingsofthe25thinter-nationalconferenceonMachinelearning,pp.160Œ167.ACM,2008.', 'Dahl,George.DeepLearningHowIDidIt:Merck1stplaceinterview.NoFreeHunch,November1,2012.', 'Dahl,GeorgeE,Jaitly,Navdeep,andSalakhutdinov,Rus-lan.Multi-taskneuralnetworksforQSARpredictions.arXivpreprintarXiv:1406.1231,2014.', 'Deng,Li,Hinton,Geoffrey,andKingsbury,Brian.Newtypesofdeepneuralnetworklearningforspeechrecog-nitionandrelatedapplications:Anoverview.InAcoustics,SpeechandSignalProcessing(ICASSP),2013IEEEInternationalConferenceon', ',pp.8599Œ8603.IEEE,2013.', "Erhan,Dumitru,L'Heureux,Pierre-Jean,Yue,ShiYi,andBengio,Yoshua.Collaborativeonafamilyofbiologicaltargets.Journalofchemicalinformationandmodeling,46(2):626Œ635,2006.", 'Jain,AjayNandNicholls,Anthony.Recommenda-tionsforevaluationofcomputationalmethods.Journalofcomputer-aidedmoleculardesign,22(3-4):133Œ139,2008.', 'Landrum,Greg.RDKit:Open-sourcecheminformatics.URLhttp://www.rdkit.org.Lowe,Derek.DidKagglePredictDrugCandidateActivi-ties?OrNot?InthePipeline,December11,2012.', 'Lusci,Alessandro,Pollastri,Gianluca,andBaldi,Pierre.Deeparchitecturesanddeeplearninginchemoinformat-ics:thepredictionofaqueoussolubilityfordrug-likemolecules.Journalofchemicalinformationandmod-eling,53(7):1563Œ1575,2013.Ma,Junshui,Sheridan,RobertP,Liaw,Andy,Dahl,George,andSvetnik,Vladimir.Deepneuralnetsasamethodforquantitativestructure-activityrelationships.JournalofChemicalInformationandModeling,2015.', 'Mysinger,MichaelM,Carchia,Michael,Irwin,JohnJ,andShoichet,BrianK.Directoryofusefuldecoys,enhanced(DUD-E):betterligandsanddecoysforbetterbench-marking.Journalofmedicinalchemistry,55(14):6582Œ6594,2012.', 
'Nair,VinodandHinton,GeoffreyE.linearunitsimproverestrictedboltzmannmachines.InProceedingsofthe27thInternationalConferenceonMachineLearn-ing(ICML-10),pp.807Œ814,2010.', 'Pedregosa,Fabian,Varoquaux,Ga¨el,Gramfort,Alexan-dre,Michel,Vincent,Thirion,Bertrand,Grisel,Olivier,Blondel,Mathieu,Prettenhofer,Peter,Weiss,Ron,Dubourg,Vincent,etal.Scikit-learn:Machinelearninginpython.TheJournalofMachineLearningResearch,12:2825Œ2830,2011.Rogers,DavidandHahn,Mathew.Extended-connectivityJournalofchemicalinformationandmod-eling,50(5):742Œ754,2010.', 'Rohrer,SebastianGandBaumann,Knut.Maximumun-biasedvalidation(MUV)datasetsforvirtualscreeningbasedonpubchembioactivitydata.Journalofchemicalinformationandmodeling,49(2):169Œ184,2009.', 'Rumelhart,DavidE,Hinton,GeoffreyE,andWilliams,RonaldJ.Learningrepresentationsbyback-propagatingerrors.Cognitivemodeling,1988.', 'Shoichet,BrianK.Virtualscreeningofchemicallibraries.Nature,432(7019):862Œ865,2004.', 'Swamidass,SJoshua,Azencott,Chlo´e-Agathe,Lin,Ting-Wan,Gramajo,Hugo,Tsai,Shiou-Chuan,andBaldi,Pierre.relevancevoting:anaccurateandinter-pretablevirtualhighthroughputscreeningmethod.Jour-nalofchemicalinformationandmodeling,49(4):756Œ766,2009.', 'Szegedy,Christian,Liu,Wei,Jia,Yangqing,Sermanet,Pierre,Reed,Scott,Anguelov,Dragomir,Erhan,Du-mitru,Vanhoucke,Vincent,andRabinovich,Andrew.Goingdeeperwithconvolutions.arXivpreprintarXiv:1409.4842,2014.', "Unterthiner,Thomas,Mayr,Andreas,¨unterKlambauer,G,Steijaert,Marvin,Wenger,J¨org,Ceulemans,Hugo,andHochreiter,Sepp.Deeplearningasanopportunityinvirtualscreening.Varnek,AlexandreandBaskin,Igor.Machinelearningmethodsforpropertypredictioninchemoinformatics:quovadis?Journalofchemicalinformationandmodel-ing,52(6):1413Œ1437,2012.Wang,Yanli,Xiao,Jewen,Suzek,TugbaO,Zhang,Jian,Wang,Jiyao,Zhou,Zhigang,Han,Lianyi,Karapetyan,Karen,Dracheva,Svetlana,Shoemaker,BenjaminA,etal.PubChem'sBioAssaydatabase.Nucleicacidsre-search,40(D1):D400ŒD412,2012.", 
'MassivelyMultitaskNetworksforDrugDiscoveryWillett,Peter,Barnard,JohnM,andDowns,GeoffreyM.Chemicalsimilaritysearching.Journalofchemicalin-formationandcomputersciences,38(6):983Œ996,1998.', 'MassivelyMultitaskNetworksforDrugDiscovery:AppendixFebruary10,2015', 'MassivelyMultitaskNetworksforDrugDiscoveryDatasetActivesInactivesTargetClassTargetpcba-aid887102472140otherenzyme15hLO-1pcba-aid89115487836otherenzymeCYP2D6pcba-aid89918097575otherenzymeCYP2C19pcba-aid902*1872123512viabilityH1299-p53A138Vpcba-aid903*33854175viabilityH1299-neopcba-aid904*52853981viabilityH1299-neopcba-aid91244568506miscellaneousanthraxLF-PAinternalizationpcba-aid91421810619transcriptionfac-torHIF-1pcba-aid91543610401transcriptionfac-torHIF-1pcba-aid924*1146122867viabilityH1299-p53A138Vpcba-aid9253964358miscellaneousEGFP-654pcba-aid92635071666GPCRTSHRpcba-aid927*6159108proteaseUSP2apcba-aid938177570241ionchannelCNGpcba-aid995*69970189signallingpath-wayERK1/2cascadepcba-aid103015963200920', 'otherenzymeALDH1A1pcba-aid1379*562198500', 'otherenzymeluciferasepcba-aid1452177151634otherenzyme12hLOpcba-aid1454*536130788signallingpath-wayERK1/2cascadepcba-aid1457722204859otherenzymeIMPasepcba-aid14585805202680miscellaneousSMN2pcba-aid1460*5662261757protein-proteininteractionK18pcba-aid14612305218561GPCRNPSRpcba-aid1468*1039270371protein-proteininteractionK18pcba-aid1469169276098protein-proteininteractionTRb-SRC2pcba-aid1471288223321protein-proteininteractionhuntingtinpcba-aid1479788275479miscellaneousTRb-SRC2pcba-aid1631892262774otherenzymehPK-M2pcba-aid1634154263512otherenzymehPK-M2pcba-aid16882374218200protein-proteininteractionHTTQ103pcba-aid17211087291649otherenzymeLmPKpcba-aid2100*1159301145otherenzymealpha-glucosidasepcba-aid2101*285321268otherenzymeglucocerebrosidasepcba-aid21473477223441otherenzymeJMJD2Epcba-aid2242*715198459', 'otherenzymealpha-glucosidasepcba-aid23261069268500miscellaneousANS1pcba-aid24512008', 
'MassivelyMultitaskNetworksforDrugDiscoveryDatasetActivesInactivesTargetClassTargetpcba-aid255116666288772transcriptionfac-torRORgammapcba-aid2662110293953miscellaneousMLL-HOX-Apcba-aid267599279333miscellaneousMBNL1-CUGpcba-aid26761081361124GPCRRXFP1pcba-aid463254*41330640proteaseUSP2apcba-aid485281254341253miscellaneousapoferritinpcba-aid485290942343503otherenzymeTDP1pcba-aid485294*148362056otherenzymeAmpCpcba-aid4852979126311481promoterRab9pcba-aid4853137567313119promoterNPC1pcba-aid4853144491329974otherenzymeDNApolymerasebetapcba-aid485341*1729328952otherenzymeAmpCpcba-aid485349618321745proteinkinaseATMpcba-aid485353603328042proteasePLPpcba-aid4853601485223830protein-proteininteractionL3MBTL1pcba-aid48536410700345950otherenzymeTGRpcba-aid485367557330124otherenzymePFKpcba-aid49294780330601GPCRbeta2-ARpcba-aid49320834243647proteinkinasemTORpcba-aid504327759380820otherenzymeGCN5L2pcba-aid50433230586317753otherenzymeG9apcba-aid50433315670341165protein-proteininteractionBAZ2Bpcba-aid50433916857367661protein-proteininteractionJMJD2Apcba-aid5044447390353475transcriptionfac-torNrf2pcba-aid5044664169325944viabilityHEK293T-ELG1-lucpcba-aid5044677647322464promoterELG1pcba-aid504706201321230miscellaneousp53pcba-aid504842101329517otherenzymeMm-CPNpcba-aid504845104385400miscellaneousRGS4pcba-aid5048473515390525transcriptionfac-torVDRpcba-aid50489134383652otherenzymePin1pcba-aid540276*4494279673miscellaneousMarburgviruspcba-aid5403172126381226protein-proteininteractionHP1-betapcba-aid588342*25034335826otherenzymeluciferasepcba-aid588453*3921382731otherenzymeTrxR1pcba-aid588456*51386206otherenzymeTrxR1pcba-aid5885791987', 
'MassivelyMultitaskNetworksforDrugDiscoveryDatasetActivesInactivesTargetClassTargetpcba-aid602310310402026protein-proteininteractionVif-A3Gpcba-aid602313762383076protein-proteininteractionVif-A3Fpcba-aid60233270415773promoterGRP78pcba-aid624170837404440otherenzymeGLSpcba-aid6241711239402621transcriptionfac-torNrf2pcba-aid624173488406224otherenzymePYKpcba-aid6242023968372045promoterBRCA1pcba-aid624246101367273miscellaneousERGpcba-aid624287423334388signallingpath-wayGsgsppcba-aid6242881356336077signallingpath-wayGsgsppcba-aid624291222345619promotera7pcba-aid624296*9841333378miscellaneousDNAre-replicationpcba-aid624297*6214336050miscellaneousDNAre-replicationpcba-aid6244176388398731GPCRGLP-1pcba-aid6516353784387779promoterATXNpcba-aid651644748361115miscellaneousVprpcba-aid6517681677362320otherenzymeWRNpcba-aid651965', '6422331953', 'MassivelyMultitaskNetworksforDrugDiscoveryFigureA.3.Multitaskperformanceofduplicateanduniquetargets.Outliersareomittedforclarity.Notchesindicateadenceintervalaroundthemedian,computedas1:57IQR=pN(McGilletal.,1978', 'MassivelyMultitaskNetworksforDrugDiscoveryB.PerformancemetricsTableB.1.SigntestCIsforeachgroupofdatasets.EachmodeliscomparedtothePyramidal(2000', ';100)MultitaskNeuralNet,.25Dropoutmodel.ModelPCBA(n=128)MUV(n=17)Tox21(n=12)LogisticRegression(LR)[:3;:11][:13;:53][:00;:24]RandomForest(RF)[:05;:16][:00;:18][:14;:61]Single-TaskNeuralNet(STNN)[:02;:10][:13;:53][:00;:24]Pyramidal(2000', ';100)STNN,.25Dropout(PSTNN)[:05;:15][:13;:53][:00;:24]MaxfLR,RF,STNN,PSTNNg[:09;:21][:13;:53][:14;:61]1-Hidden(1200)LayerMultitaskNeuralNet(MTNN)[:05;:15][:22;:64][:01;:35]TableB.2.EnrichmentscoresforallmodelsreportedinTable2.Eachvalueisthemedianacrossthedatasetsinagroupofthemeank-foldenrichmentvalues.Enrichmentisanalternatemeasureofmodelperformancecommoninvirtualdrugscreening.WeusethefiROCenrichmentflfrom(Jain&Nicholls,2008', 
'MassivelyMultitaskNetworksforDrugDiscoveryFigureB.1.GraphicalrepresentationofdatafromTable2inthetext.Notchesindicateaintervalaroundthemedian,computedas1:57IQR=pN(McGilletal.,1978', 'MassivelyMultitaskNetworksforDrugDiscoveryC.TrainingDetailsThemultitasknetworksinTable2weretrainedwithlearningrate:0003andbatchsize128for50Mstepsusingstochasticgradientdescent.Weightswereinitializedfromazero-meanGaussianwithstandarddeviation:01.Thebiaswasinitializedat:5.Weexperimentedwithhigherlearningrates,butfoundthatthepyramidalnetworkssometimesfailedtotrain(thetophiddenlayerzeroeditselfout).However,thiseffectvanishedwiththelowerlearningrate.Mostofthemodelsweretrainedwith64simultaneousreplicassharingtheirgradientupdates,butinsomecasesweusedasmanyas256.Thepyramidalsingle-tasknetworksweretrainedwiththesamesettings,butfor100Ksteps.Thevanillasingle-tasknetworksweretrainedwithlearningrate:001for100Ksteps.ThenetworksusedinFigure3andFigure4weretrainedwithlearningrate0:003for500epochsplusaconstant3millionsteps.Theconstantfactorwasintroducedafterweobservedthatthesmallermultitasknetworksrequiredmoreepochsthanthelargernetworkstostabilize.ThenetworksinFigure5weretrainedwithaPyramidal(1000,50)SingleTaskarchitecture(matchingthenetworksinFigure3).TheweightswereinitializedwiththeweightsfromthenetworksrepresentedinFigure3andthentrainedfor100Kstepswithalearningrateof0.0003.Aswenotedinthemaintext,thedatasetsinourcollectioncontainedmanymoreinactivethanactivecompounds.Toensuretheactivesweregivenadequateimportanceduringtraining,weweightedtheactivesforeachdatasettohavetotalweightequaltothenumberofinactivesforthatdataset(inactivesweregivenunitweight).TableC.1containstheresultsofourpyramidalmodelsensitivityanalysis.TablesC.2andC.3giveresultsforavarietyofadditionalmodelsnotreportedinTable2.TableC.1.Pyramidsensitivityanalysis.Median5-fold-average-AUCvaluesaregivenforseveralvariationsofthepyramidalarchitec-ture.Inanattempttoavoidtheproblemoftrainingfailuresduetothetoplayerbecomingallzeroearlyinthetraining,thele
arningratewassetto0.0001forthe2Mstepsthento0.0003for28Msteps.ModelPCBA(n=128)MUV(n=17)Tox21(n=12)Pyramidal(1000;50)MTNN:846:825:799Pyramidal(1000;100)MTNN:845:818:796Pyramidal(1000;150)MTNN:842:812:798Pyramidal(2000', ';50)MTNN:846:819:794Pyramidal(2000', ';100)MTNN:846:821:798Pyramidal(2000', 'MassivelyMultitaskNetworksforDrugDiscoveryTableC.2.Descriptionsforadditionalmodels.MTNN:multitaskneuralnet.fiAuxiliaryheadsflreferstotheattachmentofindependentsoftmaxunitsforeachtasktohiddenlayers(seeSzegedyetal.,2014', ').Unlessotherwisemarked,assume10Mtrainingsteps.A8-Hidden(300)LayerMTNN,auxiliaryheadsattachedtohiddenlayers3and6,6MstepsB1-Hidden(3000)LayerMTNN,1MstepsC1-Hidden(3000)LayerMTNN,1.5MstepsDPyramidal(1800;100),2deep,reconnected(originalinputconcatenatedtopyramidoutput)EPyramidal(1800;100),3deepF4-Hidden(1000)LayerMTNN,auxiliaryheadsattachedtohiddenlayer2,4.5MstepsGPyramidal(2000', ';100)MTNN,10%connectedHPyramidal(2000', ';100)MTNN,50%connectedIPyramidal(2000', ';100)MTNN,:001learningrateJPyramidal(2000', ';100)MTNN,50Msteps,:0003learningrateKPyramidal(2000', ';100)MTNN,:25Dropoutlayeronly),50MstepsLPyramidal(2000', 'Jain,AjayNandNicholls,Anthony.Recommendationsforevaluationofcomputationalmethods.Journalofcomputer-aidedmoleculardesign,22(3-4):133Œ139,2008.', 'McGill,Robert,Tukey,JohnW,andLarsen,WayneA.Variationsofboxplots.TheAmericanStatistician,32(1):12Œ16,1978.', 'Szegedy,Christian,Liu,Wei,Jia,Yangqing,Sermanet,Pierre,Reed,Scott,Anguelov,Dragomir,Erhan,Dumitru,Van-houcke,Vincent,andRabinovich,Andrew.Goingdeeperwithconvolutions.arXivpreprintarXiv:1409.4842,2014.']
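Note on the extracted "Training Details" text above: it says the actives in each dataset were weighted so that their total weight equals the number of inactives, with inactives given unit weight. A minimal sketch of that arithmetic follows. The function names and the 0/1 label encoding are assumptions made for illustration and do not come from the paper.

```python
def active_weight(n_actives, n_inactives):
    """Per-example weight for actives so their total weight equals n_inactives.

    Inactives keep unit weight, as described in the extracted text.
    """
    if n_actives <= 0:
        raise ValueError("need at least one active to compute a weight")
    return n_inactives / n_actives


def example_weights(labels):
    """Return one weight per label (assumed encoding: 1 = active, 0 = inactive)."""
    n_act = sum(1 for y in labels if y == 1)
    n_inact = len(labels) - n_act
    w = active_weight(n_act, n_inact)
    return [w if y == 1 else 1.0 for y in labels]


# Toy labels, not taken from any dataset in the table.
toy_labels = [1, 0, 0, 0, 0, 1]
toy_weights = example_weights(toy_labels)
```

With this scheme the summed weight of the actives matches the count of inactives, so both classes contribute equally to a weighted loss regardless of how imbalanced the dataset is.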
['[1]Mart´Abadi,AshishAgarwal,PaulBarham,EugeneBrevdo,ZhifengChen,CraigCitro,GregS.Corrado,AndyDavis,JeffreyDean,MatthieuDevin,SanjayGhe-mawat,IanGoodfellow,AndrewHarp,GeoffreyIrv-ing,MichaelIsard,YangqingJia,RafalJozefowicz,LukaszKaiser,ManjunathKudlur,JoshLevenberg,DanMan´e,RajatMonga,SherryMoore,DerekMurray,ChrisOlah,MikeSchuster,JonathonShlens,BenoitSteiner,IlyaSutskever,KunalTalwar,PaulTucker,VincentVanhoucke,VijayVasudevan,FernandaVi´egas,OriolVinyals,PeteWarden,MartinWattenberg,MartinWicke,YuanYu,andXiaoqiangZheng.TensorFlow:Large-scalemachinelearningonheterogeneoussystems,2015.Soft-', 'wareavailablefromw.org.[2]AneliaAngelova,AlexKrizhevsky,andVincentVan-houcke.Pedestriandetectionwithalarwdeepnetwork.InRoboticsandAutomation(ICRA),2015', 'IEEEInternationalConferenceon,pages704Œ711.IEEE,2015.', 'CalTechPDF.[3]ArvindandDavidE.Culler.Annualreviewofcomputersciencevol.1,1986.chapter', 'wArchitectures,pages225Œ253.1986.', 'www.dtic.mil/cgi-bin/GetTRDoc?Location=U2&doc=GetTRDoc.pdf&AD=ADA166235.[4]ArvindandRishiyurS.Nikhil.Executingapro-gramontheMITtagged-tokenwarchitec-ture.IEEETrans.Comput.,39(3):300Œ318,1990.', 'dl.acm.org/citation.cfm?id=78583.[5]JimmyBa,VolodymyrMnih,andKorayKavukcuoglu.Multipleobjectrecognitionwithvisualatten-tion.arXivpreprintarXiv:1412.7755,2014.', 'arxiv.org/abs/1412.7755.[6]Franc¸oiseBeaufays.TheneuralnetworksbehindGoogleVoicetranscription,2015.', 'googleresearch.blogspot.com/2015/08/the-neural-', 'networks-behind-google-voice.html.[7]JamesBergstra,OlivierBreuleux,Fr´ed´ericBastien,Pas-calLamblin,RazvanPascanu,GuillaumeDesjardins,JosephTurian,DavidWarde-Farley,andYoshuaBengio.Theano:ACPUandGPUmathexpressioncompiler.InProceedingsofthePythonforcomputingcon-ference(SciPy),volume4,page3.Austin,TX,2010.', 'UMontrealPDF.[8]CraigChambers,AshishRaniwala,FrancesPerry,StephenAdams,RobertRHenry,RobertBradshaw,andNathanWeizenbaum.FlumeJava:easy,efcientdata-parallelpipelines.InACMSigplanNo-tices,volume45,pages363Œ375.ACM,2010.', 
're-search.google.com/pubs/archive/35650.pdf.[9]SharanChetlur,CliffWoolley,PhilippeVandermer-sch,JonathanCohen,JohnTran,BryanCatanzaro,andEvanShelhamer.cuDNN:Efprimitivesfordeeplearning.arXivpreprintarXiv:1410.0759,2014.', 'arxiv.org/abs/1410.0759.[10]TrishulChilimbi,YutakaSuzue,JohnsonApacible,andKarthikKalyanaraman.ProjectAdam:Buildinganefandscalabledeeplearningtrainingsystem.In11thUSENIXSymposiumonOperatingSystemsDesignandImplementation(OSDI14),pages571Œ582,2014.', 'www.usenix.orpaper-chilimbi.pdf.[11]JackClark.GoogleturningitslucrativewebsearchovertoAImachines,2015.', 'www.bloomberg.com/news/articles/2015-10-26/google-', 'turning-its-lucrative-web-search-over-to-ai-machines.[12]CliffClick.Globalcodemotion/globalvaluenumber-ing.InACMSIGPLANNotices,volume30,pages246Œ257.ACM,1995.', 'courses.cs.washington.edu/courses/cse501/06wi/reading/click-pldi95.pdf.[13]RonanCollobert,SamyBengio,andJohnnyMari´ethoz.Torch:Amodularmachinelearningsoftwarelibrary.Technicalreport,IDIAP,2002.', 'KeYang,andAndrewY.Ng.Largescaledistributeddeepnetworks.InNIPS,2012.', 'GoogleResearchPDF.[15]JackJDongarra,JeremyDuCroz,SvenHammar-ling,andIainSDuff.Asetoflevel3basiclin-earalgebrasubprograms.ACMTransactionsonMathematicalSoftware(TOMS),16(1):1Œ17,1990.', 'www.maths.manchester.ac.uk/Ÿsven/pubs/Level3BLAS-1-TOMS16-90.pdf.[16]AndreaFrome,GregSCorrado,JonathonShlens,SamyBengio,JeffDean,TomasMikolov,etal.DeVISE:Adeepvisual-semanticembeddingmodel.InAdvancesinNeuralInformationPro-cessingSystems,pages2121Œ2129,2013.re-search.google.com/pubs/archive/41473.pdf.[17]JavierGonzalez-Dominguez,IgnacioLopez-Moreno,Pe-droJMoreno,andJoaquinGonzalez-Rodriguez.Frame-by-framelanguageinshortutterancesusingdeepneuralnetworks.NeuralNetworks,64:49Œ58,2015.', '[18]OtavioGood.HowGoogleTranslatesqueezesdeeplearningontoaphone,2015.', 'googleresearch.blogspot.com/2015/07/how-google-', 
'translate-squeezes-deep.html.[19]IanJ.Goodfellow,YaroslavBulatov,JulianIbarz,SachaArnoud,andVinayShet.Multi-digitnumberrecognitionfromStreetViewimageryusingdeepconvolutionalneu-ralnetworks.InInternationalConferenceonLearningRepresentations,2014.', "arxiv.org/pdf/1312.6082.[20]GeorgHeigold,VincentVanhoucke,AlanSenior,PatrickNguyen,Marc'AurelioRanzato,MatthieuDevin,andJeffreyDean.Multilingualacousticmodelsusingdis-tributeddeepneuralnetworks.InAcoustics,SpeechandSignalProcessing(ICASSP),2013IEEEInterna-", 'tionalConferenceon,pages8619Œ8623.IEEE,2013.', 're-search.google.com/pubs/archive/40807.pdf.[21]GeoffreyE.Hinton,LiDeng,DongYu,GeorgeE.Dahl,Abdel-rahmanMohamed,NavdeepJaitly,An-drewSenior,VincentVanhoucke,PatrickNguyen,TaraN.Sainath,andBrianKingsbury.Deepneuralnetworksforacousticmodelinginspeechrecognition:Thesharedviewsoffourresearchgroups.IEEESignalProcess.Mag.,29(6):82Œ97,2012.', 'www.cs.toronto.edu/Ÿgdahl/papers/deepSpeechReviewSPM2012.pdf', '.[22]SeppHochreiterandJ¨urgenSchmidhuber.Longshort-termmemory.Neuralcomputation,9(8):1735Œ1780,1997.', 'ftp.idsia.ch/pub/juergen/lstm.pdf.[23]SergeyIoffeandChristianSzegedy.Batchnormaliza-tion:Acceleratingdeepnetworktrainingbyreducinginternalcovariateshift.CoRR,abs/1502.03167,2015.arxiv.org/abs/1502.03167.[24]MichaelIsard,MihaiBudiu,YuanYu,AndrewBirrell,andDennisFetterly.Dryad:distributeddata-parallelprogramsfromsequentialbuildingblocks.InACMSIGOPSOperatingSystemsReview,volume41,pages59Œ72.ACM,2007.', 'www.michaelisard.com/pubs/eurosys07.pdf.[25]Beno‹Jacob,Ga¨elGuennebaud,etal.Eigenlibraryforlinearalgebra.eigen.tuxfamily.org.[26]YangqingJia,EvanShelhamer,JeffDonahue,SergeyKarayev,JonathanLong,RossGirshick,SergioGuadar-rama,andTrevorDarrell.Caffe:Convolutionalarchi-tectureforfastfeatureembedding.InProceedingsoftheACMInternationalConferenceonMultimedia,pages675Œ678.ACM,2014.', 
'arxiv.org/pdf/1408.5093.[27]AndrejKarpathy,GeorgeToderici,SachinShetty,TommyLeung,RahulSukthankar,andLiFei-Fei.Large-scalevideowithcon-volutionalneuralnetworks.InComputerVisionandPatternRecognition(CVPR),2014IEEECon-', 'ferenceon,pages1725Œ1732.IEEE,2014.re-search.google.com/pubs/archive/42455.pdf.[28]AKrizhevsky.Cuda-convnet,2014.', 'code.google.com/p/cuda-convnet/.[29]AlexKrizhevsky.Oneweirdtrickforparalleliz-ingconvolutionalneuralnetworks.arXivpreprintarXiv:1404.5997,2014.', "arxiv.org/abs/1404.5997.[30]AlexKrizhevsky,VinodNair,andGeoffreyHinton.TheCIFAR-10dataset.www.cs.toronto.edu/Ÿkriz/cifar.html.[31]QuocLe,Marc'AurelioRanzato,RajatMonga,MatthieuDevin,GregCorrado,KaiChen,JeffDean,andAndrewNg.Buildinghigh-levelfeaturesusinglargescaleunsu-pervisedlearning.InICML'2012", ',2012.', 'GoogleResearchPDF.[32]YannLeCun,CorinnaCortes,andChristopherJCBurges.TheMNISTdatabaseofhandwrittendigits,1998.', 'yann.lecun.com/exdb/mnist/.[33]MuLi,DaveAndersen,andAlexSmola.Parameterserver.parameterserver.org.[34]ChrisJMaddison,AjaHuang,IlyaSutskever,andDavidSilver.MoveevaluationinGousingdeepconvolutionalneuralnetworks.arXivpreprintarXiv:1412.6564,2014.', 'arxiv.org/abs/1412.6564.[35]TomasMikolov,KaiChen,GregCorrado,andJef-freyDean.Efestimationofwordrepresenta-tionsinvectorspace.InInternationalConferenceonLearningRepresentations:WorkshopsTrack,2013.', 'arxiv.org/abs/1301.3781.[36]DerekGMurray,FrankMcSherry,RebeccaIsaacs,MichaelIsard,PaulBarham,andMart´Abadi.Naiad:atimelywsystem.InProceedingsoftheTwenty-FourthACMSymposiumonOperatingSystemsPrinci-ples,pages439Œ455.ACM,2013.', 'MicrosoftResearchPDF.[37]DerekG.Murray,MalteSchwarzkopf,ChristopherSmowton,StevenSmit,AnilMadhavapeddy,andStevenHand.Ciel:auniversalexecutionenginefordis-tributedwcomputing.InProceedingsoftheNinthUSENIXSymposiumonNetworkedSystemsDesignandImplementation,2011.', 
'[38]ArunNair,PraveenSrinivasan,SamBlackwell,CagdasAlcicek,RoryFearon,AlessandroDeMaria,Ve-davyasPanneershelvam,MustafaSuleyman,CharlesBeattie,StigPetersen,etal.Massivelyparallelmeth-odsfordeepreinforcementlearning.arXivpreprintarXiv:1507.04296,2015.', 'arxiv.org/abs/1507.04296.[39]CUDANvidia.Cublaslibrary.NVIDIACorpo-ration,SantaClara,California,15,2008.', 'devel-oper.nvidia.com/cublas.[40]JonathanRagan-Kelley,ConnellyBarnes,AndrewAdams,SylvainParis,Fr´edoDurand,andSamanAma-rasinghe.Halide:Alanguageandcompilerforoptimiz-ingparallelism,locality,andrecomputationinimagepro-cessingpipelines.ACMSIGPLANNotices,48(6):519Œ530,2013.', 'people.csail.mit.edu/fredo/tmp/Halide-5min.pdf.[41]BharathRamsundar,StevenKearnes,PatrickRiley,DaleWebster,DavidKonerding,andVijayPande.Massivelymultitasknetworksfordrugdiscovery.arXivpreprintarXiv:1502.02072,2015.', 'arxiv.org/abs/1502.02072.[42]BenjaminRecht,ChristopherRe,StephenWright,andFengNiu.Hogwild:Alock-freeapproachtoparal-lelizingstochasticgradientdescent.InAdvancesinNeuralInformationProcessingSystems,pages693Œ701,2011.', 'papers.nips.cc/paper/4390-hogwild-a-lock-free-approach-to-parallelizing-stochastic-gradient-descent.[43]ChuckRosenberg.ImprovingPhotoSearch:Astepacrossthesemanticgap,2013.', 'googleresearch.blogspot.com/2013/06/improving-', 'photo-search-step-across.html.[44]ChristopherJRossbach,YuanYu,JonCurrey,Jean-PhilippeMartin,andDennisFetterly.Dandelion:acompilerandruntimeforheterogeneoussystems.InProceedingsoftheTwenty-FourthACMSymposiumonOperatingSystemsPrinciples,pages49Œ68.ACM,2013.', 'research-srv.microsoft.com/pubs/201110/sosp13-', '.[45]DavidERumelhart,GeoffreyEHinton,andRonaldJWilliams.Learningrepresentationsbyback-propagatingerrors.Cognitivemodeling,5:3,1988.', 'www.cs.toronto.edu/hinton/absps/naturebp.pdf.[46]Has¸imSak,AndrewSenior,KanishkaRao,Franc¸oiseBeaufays,andJohanSchalkwyk.GoogleVoiceSearch:fasterandmoreaccurate,2015.', 'googleresearch.blogspot.com/2015/09/google-voice-', 
'search-faster-and-more.html.[47]IlyaSutskever,OriolVinyals,andQuocV.Le.Sequencetosequencelearningwithneuralnetworks.InNIPS,2014.', "papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural.[48]ChristianSzegedy,WeiLiu,YangqingJia,PierreSer-manet,ScottReed,DragomirAnguelov,DumitruEr-han,VincentVanhoucke,andAndrewRabinovich.Go-ingdeeperwithconvolutions.InCVPR'2015", ',2015.', 'arxiv.org/abs/1409.4842.[49]SeiyaTokui.Chainer:Apowerful,xibleandintuitiveframeworkofneuralnetworks.chainer.org.[50]VincentVanhoucke.Speechrecognitionanddeeplearn-ing,2015.', 'googleresearch.blogspot.com/2012/08/speech-', 'recognition-and-deep-learning.html.[51]AbhishekVerma,LuisPedrosa,MadhukarKorupolu,DavidOppenheimer,EricTune,andJohnWilkes.Large-scaleclustermanagementatGooglewithBorg.InProceedingsoftheTenthEuropeanConferenceonComputerSystems,page18.ACM,2015.', 're-search.google.com/pubs/archive/43438.pdf.[52]O.Vinyals,L.Kaiser,T.Koo,S.Petrov,I.Sutskever,andG.Hinton.Grammarasaforeignlanguage.Technicalreport,arXiv:1412.7449,2014.arxiv.org/abs/1412.7449.[53]OriolVinyals,MeireFortunato,andNavdeepJaitly.Pointernetworks.InNIPS,2015.', 'arxiv.org/abs/1506.03134.[54]DongYu,AdamEversole,MikeSeltzer,KaishengYao,ZhihengHuang,BrianGuenter,OleksiiKuchaiev,YuZhang,FrankSeide,HuamingWang,etal.Anintroductiontocomputationalnetworksandthecom-putationalnetworktoolkit.Technicalreport,Tech.Rep.MSR,MicrosoftResearch,2014,2014.', 're-search.microsoft.com/apps/pubs/?id=226641.[55]MateiZaharia,MosharafChowdhury,TathagataDas,AnkurDave,JustinMa,MurphyMcCauley,MichaelJFranklin,ScottShenker,andIonStoica.Resilientdistributeddatasets:Afault-tolerantabstractionforin-memoryclustercomputing.InProceedingsofthe9thUSENIXconferenceonNetworkedSystemsDe-signandImplementation.USENIXAssociation,2012.', "www.usenix.or.[56]MatthewD.Zeiler,Marc'AurelioRanzato,RajatMonga,MarkMao,KeYang,QuocLe,PatrickNguyen,AndrewSenior,VincentVanhoucke,JeffDean,andGeoffreyE.Hinton.Onlinearunitsforspeechprocessing.InICASSP,2013."]
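Note on the reference list above: the extraction breaks entries at arbitrary points, so one reference often spans several list elements. One possible way to regroup them is to join the fragments and split on the bracketed `[n]` markers. The sketch below assumes those markers survive extraction, which is not always the case in the output above (the marker for entry 14 is missing), so it keys entries by the number actually found instead of assuming consecutive numbering. `split_references` and the sample fragments are hypothetical.

```python
import re


def split_references(fragments):
    """Join extracted fragments and split them into {number: text} by [n] markers."""
    text = "".join(fragments)
    # With a capturing group, re.split returns [prefix, num1, body1, num2, body2, ...].
    parts = re.split(r"\[(\d+)\]", text)
    refs = {}
    for i in range(1, len(parts) - 1, 2):
        refs[int(parts[i])] = parts[i + 1].strip()
    return refs


# Placeholder fragments that mimic the shape of the output, not real entries.
sample = ["[1]FirstAuthor.SomeTitle,2015.Soft-", "wareavailable.[2]SecondAuthor.OtherTitle,2014."]
sample_refs = split_references(sample)
```

An entry whose marker was lost is silently merged into the preceding one, so the result is worth checking against the expected number of references.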