
AutoML: A Survey of Recent Advances

2019-08-12    Source: raincent


This article was written by Xin He of Hong Kong Baptist University.

English title | AutoML: A Survey of the State-of-the-Art

Authors | Xin He, Kaiyong Zhao, Xiaowen Chu

Affiliation | Hong Kong Baptist University

Paper | https://arxiv.org/abs/1908.00709

Deep learning has been applied in many fields and has brought great convenience to people's lives. However, building a high-quality deep learning system for a specific task not only takes a great deal of time and resources, but also relies heavily on domain expertise.

To make deep learning techniques easier to apply to more fields, automated machine learning (AutoML) has therefore become a growing focus of attention.

This article first summarizes AutoML research for each stage of an end-to-end pipeline (see the figure below), then focuses on neural architecture search (NAS), which has been studied intensively in recent years, and finally discusses some future research directions.

[Figure: overview of the end-to-end AutoML pipeline]

I. Data Preparation

Data is known to be crucial for deep learning tasks, so a good AutoML system should be able to improve data quality and quantity automatically. We divide data preparation into two parts: data collection and data cleaning.


1. Data Collection

Public datasets keep emerging, such as MNIST, CIFAR-10, and ImageNet, and various datasets can also be obtained from public portals such as Kaggle, Google Dataset Search, and Elsevier Data Search. However, for some special tasks, especially medical tasks or tasks involving personal privacy, data is hard to obtain, so it is often difficult to find a suitable dataset, or the available dataset is very small. There are two main approaches to this problem: data generation and data search.

1) Data generation

Images:

Cubuk, Ekin D., et al. "AutoAugment: Learning augmentation policies from data." arXiv preprint arXiv:1805.09501 (2018).

Speech:

Park, Daniel S., et al. "SpecAugment: A simple data augmentation method for automatic speech recognition." arXiv preprint arXiv:1904.08779 (2019).

Text:

Xie, Ziang, et al. "Data noising as smoothing in neural network language models." arXiv preprint arXiv:1703.02573 (2017).

Yu, Adams Wei, et al. "QANet: Combining local convolution with global self-attention for reading comprehension." arXiv preprint arXiv:1804.09541 (2018).

GAN:

Karras, Tero, Samuli Laine, and Timo Aila. "A style-based generator architecture for generative adversarial networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.

Simulators:

Brockman, Greg, et al. "OpenAI Gym." arXiv preprint arXiv:1606.01540 (2016).
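To make the image-augmentation route above concrete: AutoAugment learns augmentation policies automatically, but the underlying building blocks are ordinary label-preserving transforms. Below is a minimal hand-written policy, assuming PyTorch/torchvision; the specific transforms and magnitudes are illustrative, not a learned AutoAugment policy.

```python
import torchvision.transforms as T
from torchvision.datasets import CIFAR10

# A fixed, hand-designed augmentation pipeline; AutoAugment instead searches over
# which transforms to apply and with what probability and magnitude.
train_transform = T.Compose([
    T.RandomCrop(32, padding=4),        # random translation via padded crop
    T.RandomHorizontalFlip(p=0.5),      # mirror the image half of the time
    T.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    T.ToTensor(),
])

# Each time an image is drawn from the dataset, a different random variant is produced,
# effectively enlarging the training set.
train_set = CIFAR10(root="./data", train=True, download=True, transform=train_transform)
```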

2) Data search

Roh, Yuji, Geon Heo, and Steven Euijong Whang. "A survey on data collection for machine learning: a big data-AI integration perspective." arXiv preprint arXiv:1811.03402 (2018).

Yarowsky, David. "Unsupervised word sense disambiguation rivaling supervised methods." 33rd Annual Meeting of the Association for Computational Linguistics. 1995.

Zhou, Yan, and Sally Goldman. "Democratic co-learning." 16th IEEE International Conference on Tools with Artificial Intelligence. IEEE, 2004.

2. Data Cleaning

Krishnan, Sanjay, and Eugene Wu. "AlphaClean: Automatic generation of data cleaning pipelines." arXiv preprint arXiv:1904.11827 (2019).

Chu, Xu, et al. "Katara: A data cleaning system powered by knowledge bases and crowdsourcing." Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. ACM, 2015.

Krishnan, Sanjay, et al. "ActiveClean: An interactive data cleaning framework for modern machine learning." Proceedings of the 2016 International Conference on Management of Data. ACM, 2016.

Krishnan, Sanjay, et al. "SampleClean: Fast and Reliable Analytics on Dirty Data." IEEE Data Eng. Bull. 38.3 (2015): 59-75.

II. Feature Engineering

Feature engineering can be divided into three parts:

1. Feature Selection

2. Feature Construction

H. Vafaie and K. De Jong, “Evolutionary feature space transformation,” in Feature Extraction, Construction and Selection. Springer, 1998, pp. 307–323.

J. Gama, “Functional trees,” Machine Learning, vol. 55, no. 3, pp. 219–250, 2004.

D. Roth and K. Small, “Interactive feature space construction using semantic information,” in Proceedings of the Thirteenth Conference on Computational Natural Language Learning. Association for Computational Linguistics, 2009, pp. 66–74.

3. Feature Extraction

Q. Meng, D. Catchpoole, D. Skillicorn, and P. J. Kennedy, “Relational autoencoder for feature extraction,” in 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, 2017, pp. 364–371.

O. Irsoy and E. Alpaydın, “Unsupervised feature extraction with autoencoder trees,” Neurocomputing, vol. 258, pp. 63–73, 2017.


III. Model Generation

There are two main approaches to model generation. One is to build models with traditional machine learning methods such as SVMs and decision trees; open-source libraries of this kind include Auto-sklearn and TPOT. The other is neural architecture search (NAS). We summarize NAS from two aspects: the network structures being searched and the search strategies.
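As a concrete example of the first, traditional-ML route, the open-source TPOT library mentioned above searches over scikit-learn pipelines with an evolutionary algorithm. A minimal sketch (the small generation and population settings are only to keep the run short; real use needs larger budgets):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# TPOT evolves a population of scikit-learn pipelines
# (preprocessing steps + model + hyperparameters).
automl = TPOTClassifier(generations=5, population_size=20, cv=5,
                        random_state=42, verbosity=2)
automl.fit(X_train, y_train)

print(automl.score(X_test, y_test))
automl.export("best_pipeline.py")   # writes the best found pipeline as plain Python code
```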


1. Network Structures

1) Entire structure:

Methods in this category generate a complete network structure directly. They have obvious drawbacks: the architecture search space is very large, and the resulting networks lack transferability and flexibility.

B. Zoph and Q. V. Le, “Neural architecture search with reinforcement learning.” [Online]. Available: http://arxiv.org/abs/1611.01578

H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, “Efficient neural architecture search via parameter sharing,” in ICML. [Online]. Available: http://arxiv.org/abs/1802.03268

2) Cell-based structure:

Cell-based design was proposed to address the problems of entire-structure search. As shown in the figure below, once a cell structure has been found, the final network is obtained by stacking a number of these cells. Clearly, the search space shrinks from the whole network to a much smaller cell, and the network can be scaled simply by changing the number of stacked cells. However, this approach also has an obvious problem: the number of cells and the way they are connected are not determined by the search, and most current methods set them based on human experience.

[Figure: a complete network built by stacking searched cell structures]

H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, “Efficient neural architecture search via parameter sharing,” in ICML. [Online]. Available: http://arxiv.org/abs/1802.03268

B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, “Learning transferable architectures for scalable image recognition.” [Online]. Available: http://arxiv.org/abs/1707.07012

Z. Zhong, J. Yan, W. Wu, J. Shao, and C.-L. Liu, “Practical block-wise neural network architecture generation.” [Online]. Available: http://arxiv.org/abs/1708.05552

B. Baker, O. Gupta, N. Naik, and R. Raskar, “Designing neural network architectures using reinforcement learning,” in ICLR. [Online]. Available: http://arxiv.org/abs/1611.02167

E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. Le, and A. Kurakin, “Large-scale evolution of image classifiers.” [Online]. Available: http://arxiv.org/abs/1703.01041

E. Real, A. Aggarwal, Y. Huang, and Q. V. Le, “Regularized evolution for image classifier architecture search.” [Online]. Available: http://arxiv.org/abs/1802.01548
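As a concrete illustration of the cell-based design described above, here is a minimal PyTorch sketch: the `Cell` is a hand-written stand-in for a searched cell (in practice the operations and connections inside it are what the search produces, and reduction cells handle downsampling), and the full network is just that cell stacked a configurable number of times.

```python
import torch
import torch.nn as nn

class Cell(nn.Module):
    """Stand-in for a searched cell; in NAS, the ops and wiring inside are what gets searched."""
    def __init__(self, channels):
        super().__init__()
        self.op1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.op2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        return torch.relu(self.bn(self.op1(x) + self.op2(x)))

class CellStackedNet(nn.Module):
    """Final network = stem + the same cell stacked num_cells times + classifier head."""
    def __init__(self, num_cells=8, channels=32, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.cells = nn.Sequential(*[Cell(channels) for _ in range(num_cells)])
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        x = self.cells(self.stem(x))
        x = x.mean(dim=(2, 3))          # global average pooling
        return self.head(x)

model = CellStackedNet(num_cells=8)       # depth is changed only via num_cells
out = model(torch.randn(2, 3, 32, 32))    # -> shape (2, 10)
```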

3) Hierarchical structure:

Unlike the chain-style stacking of cells above, hierarchical methods use the cell structures generated in one step as the basic building blocks of the next step, obtaining the final network iteratively. As shown in the figure below, in (a) the three items on the left are primitive operations and the item on the right is a cell built from them; in (b) the left side shows several cells generated in the previous step, which are then combined according to some strategy into a higher-level cell.

H. Liu, K. Simonyan, O. Vinyals, C. Fernando, and K. Kavukcuoglu, “Hierarchical representations for efficient architecture search,” in ICLR, p. 13.
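To make the iterative composition concrete, here is a much-simplified sketch in which each motif is just a sequence of lower-level motifs (the method above composes them into small graphs rather than chains); `assemble` expands a higher-level motif down to primitive operations.

```python
# Level-0 "motifs" are primitive operations (strings here for simplicity).
PRIMITIVES = ["conv3x3", "conv1x1", "maxpool"]

def assemble(motif):
    """Recursively expand a motif into the flat list of primitives it is built from."""
    if isinstance(motif, str):          # base case: a primitive operation
        return [motif]
    ops = []
    for sub in motif:                   # a motif is a sequence of lower-level motifs
        ops.extend(assemble(sub))
    return ops

# Level-1 motifs are built from primitives, level-2 motifs from level-1 motifs, and so on.
level1_a = ["conv3x3", "maxpool"]
level1_b = ["conv1x1", "conv3x3", "conv3x3"]
level2 = [level1_a, level1_b, level1_a]

print(assemble(level2))
# ['conv3x3', 'maxpool', 'conv1x1', 'conv3x3', 'conv3x3', 'conv3x3', 'maxpool']
```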

[Figure: (a) primitive operations assembled into a cell; (b) cells from the previous level assembled into a higher-level cell]

4) Network morphism-based structure:

The usual way to design a network is to design an architecture, train it, and check its performance on a validation set; if the performance is poor, a new network is designed from scratch. Obviously, this design process wastes a great deal of work and therefore a great deal of time. Network morphism-based methods instead modify an existing network, so they largely preserve the strengths of the original; moreover, the special transformations they use guarantee that the new network can reproduce the original network's function, which means its performance will be at least as good as the original.
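A minimal NumPy sketch of the kind of function-preserving transformation involved (a Net2Net-style widening of one hidden layer; see the references below): duplicated units are copied into the incoming weights, and the corresponding outgoing weights are divided by the number of copies, so the widened network computes exactly the same function as the original and can then be fine-tuned from there.

```python
import numpy as np

def net2wider(W1, b1, W2, new_width, rng=np.random.default_rng(0)):
    """Widen the hidden layer between W1 (in x h) and W2 (h x out), preserving the function."""
    h = W1.shape[1]
    # Map each new unit to an existing unit; the first h units map to themselves.
    mapping = np.concatenate([np.arange(h), rng.integers(0, h, new_width - h)])
    counts = np.bincount(mapping, minlength=h)           # how many copies of each original unit
    W1_new = W1[:, mapping]                               # copy incoming weights and biases
    b1_new = b1[mapping]
    W2_new = W2[mapping, :] / counts[mapping][:, None]    # split outgoing weights among copies
    return W1_new, b1_new, W2_new

# Quick check that the widened network computes the same outputs.
rng = np.random.default_rng(1)
W1, b1, W2 = rng.normal(size=(8, 4)), rng.normal(size=4), rng.normal(size=(4, 3))
x = rng.normal(size=(5, 8))
y_old = np.maximum(x @ W1 + b1, 0) @ W2                   # ReLU hidden layer, linear output

W1w, b1w, W2w = net2wider(W1, b1, W2, new_width=7)
y_new = np.maximum(x @ W1w + b1w, 0) @ W2w
print(np.allclose(y_old, y_new))                          # True: function preserved
```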


T. Chen, I. Goodfellow, and J. Shlens, “Net2Net: Accelerating learning via knowledge transfer,” arXiv preprint arXiv:1511.05641, 2015.

T. Elsken, J. H. Metzen, and F. Hutter, “Efficient multi-objective neural architecture search via Lamarckian evolution.” [Online]. Available: http://arxiv.org/abs/1804.09081

H. Cai, T. Chen, W. Zhang, Y. Yu, and J. Wang, “Efficient architecture search by network transformation,” in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.

2. Search Strategies

1) Grid search

H. H. Hoos, Automated Algorithm Configuration and Parameter Tuning, 2011.

I. Czogiel, K. Luebke, and C. Weihs, Response surface methodology for optimizing hyper parameters. Universitatsbibliothek Dortmund, 2006.

C.-W. Hsu, C.-C. Chang, C.-J. Lin et al., “A practical guide to support vector classification,” 2003.

J. Y. Hesterman, L. Caucci, M. A. Kupinski, H. H. Barrett, and L. R. Furenlid, “Maximum-likelihood estimation with a contracting-grid search algorithm,” IEEE Transactions on Nuclear Science, vol. 57, no. 3, pp. 1077–1084, 2010.
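As a simple illustration of the grid-search strategy (here with scikit-learn's GridSearchCV on an SVM rather than a neural network; this shows only the generic strategy, not any of the cited algorithms):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Exhaustively evaluate every combination in the grid with 5-fold cross-validation.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-4, 1e-3, 1e-2]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```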

2) Random search

J. Bergstra and Y. Bengio, “Random search for hyper-parameter optimization,” p. 25.

H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio, “An empirical evaluation of deep architectures on problems with many factors of variation,” in Proceedings of the 24th International Conference on Machine Learning. ACM, 2007, pp. 473–480.

L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, “Hyperband: A novel bandit-based approach to hyperparameter optimization.” [Online]. Available: http://arxiv.org/abs/1603.06560
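And a corresponding sketch of plain random search: hyperparameters are sampled independently (log-uniformly here) and the best configuration seen so far is kept. `validation_error` is a hypothetical placeholder standing in for training a model and measuring its validation error.

```python
import numpy as np

rng = np.random.default_rng(0)

def validation_error(lr, weight_decay):
    # Hypothetical stand-in: in practice, train the model with these hyperparameters
    # and return its error on a held-out validation set.
    return (np.log10(lr) + 3) ** 2 + (np.log10(weight_decay) + 4) ** 2

best_config, best_err = None, np.inf
for _ in range(50):
    lr = 10 ** rng.uniform(-5, -1)           # log-uniform sample for the learning rate
    wd = 10 ** rng.uniform(-6, -2)           # log-uniform sample for the weight decay
    err = validation_error(lr, wd)
    if err < best_err:
        best_config, best_err = (lr, wd), err

print("best:", best_config, "error:", best_err)
```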

3) Reinforcement learning

B. Zoph and Q. V. Le, “Neural architecture search with reinforcement learning.” [Online]. Available: http://arxiv.org/abs/1611.01578

B. Baker, O. Gupta, N. Naik, and R. Raskar, “Designing neural network architectures using reinforcement learning,” in ICLR. [Online]. Available: http://arxiv.org/abs/1611.02167

H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, “Efficient neural architecture search via parameter sharing,” in ICML. [Online]. Available: http://arxiv.org/abs/1802.03268

B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, “Learning transferable architectures for scalable image recognition.” [Online]. Available: http://arxiv.org/abs/1707.07012

Z. Zhong, J. Yan, W. Wu, J. Shao, and C.-L. Liu, “Practical block-wise neural network architecture generation.” [Online]. Available: http://arxiv.org/abs/1708.05552
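The reinforcement-learning strategy in the papers above treats architecture decisions as a controller's actions and the child network's validation accuracy as the reward. Below is a much-simplified PyTorch sketch of an LSTM controller trained with REINFORCE; `evaluate` is a hypothetical placeholder that would train the sampled child network and return its validation accuracy.

```python
import torch
import torch.nn as nn

OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]   # candidate operation for each layer
NUM_LAYERS = 6

class Controller(nn.Module):
    """LSTM that emits one operation choice per layer of the child network."""
    def __init__(self, hidden=64):
        super().__init__()
        self.hidden = hidden
        self.embed = nn.Embedding(len(OPS), hidden)
        self.cell = nn.LSTMCell(hidden, hidden)
        self.head = nn.Linear(hidden, len(OPS))

    def sample(self):
        h = torch.zeros(1, self.hidden)
        c = torch.zeros(1, self.hidden)
        inp = torch.zeros(1, self.hidden)
        arch, log_probs = [], []
        for _ in range(NUM_LAYERS):
            h, c = self.cell(inp, (h, c))
            dist = torch.distributions.Categorical(logits=self.head(h))
            action = dist.sample()
            arch.append(OPS[action.item()])
            log_probs.append(dist.log_prob(action))
            inp = self.embed(action)                 # feed the choice back in as the next input
        return arch, torch.stack(log_probs).sum()

def evaluate(arch):
    # Hypothetical placeholder: build the child network described by `arch`,
    # train it briefly, and return its validation accuracy.
    return torch.rand(1).item()

controller = Controller()
optimizer = torch.optim.Adam(controller.parameters(), lr=3e-4)
baseline = 0.0
for step in range(100):
    arch, log_prob = controller.sample()
    reward = evaluate(arch)
    baseline = 0.9 * baseline + 0.1 * reward         # moving-average baseline reduces variance
    loss = -(reward - baseline) * log_prob           # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```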

4) Evolutionary algorithms

L. Xie and A. Yuille, “Genetic CNN,” in ICCV. [Online]. Available: http://arxiv.org/abs/1703.01513

M. Suganuma, S. Shirakawa, and T. Nagao, “A genetic programming approach to designing convolutional neural network architectures.” [Online]. Available: http://arxiv.org/abs/1704.00764

E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. Le, and A. Kurakin, “Large-scale evolution of image classifiers.” [Online]. Available: http://arxiv.org/abs/1703.01041

K. O. Stanley and R. Miikkulainen, “Evolving neural networks through augmenting topologies,” Evolutionary Computation, vol. 10, no. 2, pp. 99–127. [Online]. Available: http://www.mitpressjournals.org/doi/10.1162/106365602320169811

T. Elsken, J. H. Metzen, and F. Hutter, “Efficient multi-objective neural architecture search via Lamarckian evolution.” [Online]. Available: http://arxiv.org/abs/1804.09081
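A simplified sketch of the tournament-selection ("aging" or regularized evolution) loop used in the Real et al. papers above: each round, a parent is chosen as the best of a small random sample, its mutated child is trained and added, and the oldest individual (not the worst) is removed. `train_and_score` and `mutate` are hypothetical placeholders.

```python
import random
from collections import deque

OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]
NUM_LAYERS = 6

def random_arch():
    return [random.choice(OPS) for _ in range(NUM_LAYERS)]

def mutate(arch):
    # Hypothetical mutation: change the operation at one random position.
    child = list(arch)
    child[random.randrange(NUM_LAYERS)] = random.choice(OPS)
    return child

def train_and_score(arch):
    # Hypothetical placeholder: train the architecture and return its validation accuracy.
    return random.random()

POPULATION, SAMPLE = 20, 5
population = deque()                                    # oldest individual on the left
for _ in range(POPULATION):
    arch = random_arch()
    population.append((arch, train_and_score(arch)))

history = list(population)
for _ in range(200):
    candidates = random.sample(list(population), SAMPLE)
    parent = max(candidates, key=lambda ind: ind[1])    # tournament selection
    child = mutate(parent[0])
    population.append((child, train_and_score(child)))
    population.popleft()                                # aging: discard the oldest, not the worst
    history.append(population[-1])

print(max(history, key=lambda ind: ind[1]))
```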

5) Bayesian optimization

J. Gonzalez, “GPyOpt: A Bayesian optimization framework in Python,” http://github.com/SheffieldML/GPyOpt, 2016.

J. Snoek, H. Larochelle, and R. P. Adams, “Practical Bayesian optimization of machine learning algorithms,” in Advances in Neural Information Processing Systems, 2012, pp. 2951–2959.

S. Falkner, A. Klein, and F. Hutter, “BOHB: Robust and efficient hyperparameter optimization at scale,” p. 10.

F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Sequential model-based optimization for general algorithm configuration,” in Learning and Intelligent Optimization, C. A. C. Coello, Ed. Springer Berlin Heidelberg, vol. 6683, pp. 507–523. [Online]. Available: http://link.springer.com/10.1007/978-3-642-25566-3_40

J. Bergstra, D. Yamins, and D. D. Cox, “Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures,” p. 9.

A. Klein, S. Falkner, S. Bartels, P. Hennig, and F. Hutter, “Fast Bayesian optimization of machine learning hyperparameters on large datasets.” [Online]. Available: http://arxiv.org/abs/1605.07079
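A self-contained sketch of the core loop behind these Bayesian-optimization methods: a Gaussian-process surrogate plus an expected-improvement acquisition, minimizing a toy one-dimensional objective. Real hyperparameter spaces are higher-dimensional and the objective would be a full training run.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Toy objective standing in for "validation error as a function of a hyperparameter".
    return np.sin(3 * x) + 0.1 * x ** 2

def expected_improvement(X_cand, gp, y_best, xi=0.01):
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    imp = y_best - mu - xi                       # improvement over the best point (minimization)
    z = imp / sigma
    return imp * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(3, 1))              # small random initial design
y = objective(X).ravel()

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
candidates = np.linspace(-2, 2, 500).reshape(-1, 1)
for _ in range(15):
    gp.fit(X, y)                                 # refit the surrogate to all observations so far
    ei = expected_improvement(candidates, gp, y.min())
    x_next = candidates[np.argmax(ei)].reshape(1, 1)
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next).ravel())

print("best x:", X[np.argmin(y), 0], "best value:", y.min())
```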

6) Gradient descent

H. Liu, K. Simonyan, and Y. Yang, “DARTS: Differentiable architecture search.” [Online]. Available: http://arxiv.org/abs/1806.09055

S. Saxena and J. Verbeek, “Convolutional neural fabrics,” in Advances in Neural Information Processing Systems, 2016, pp. 4053–4061.

K. Ahmed and L. Torresani, “Connectivity learning in multi-branch networks,” arXiv preprint arXiv:1709.09582, 2017.

R. Shin, C. Packer, and D. Song, “Differentiable neural network architecture search,” 2018.

D. Maclaurin, D. Duvenaud, and R. Adams, “Gradient-based hyperparameter optimization through reversible learning,” in International Conference on Machine Learning, 2015, pp. 2113–2122.

F. Pedregosa, “Hyperparameter optimization with approximate gradient,” arXiv preprint arXiv:1602.02355, 2016.

H. Cai, L. Zhu, and S. Han, “ProxylessNAS: Direct neural architecture search on target task and hardware,” 2019.

A. Hundt, V. Jain, and G. D. Hager, “sharpDARTS: Faster and more accurate differentiable architecture search,” Tech. Rep. [Online]. Available: https://arxiv.org/pdf/1903.09900.pdf
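The key trick in the gradient-based (DARTS-style) methods above is to relax the discrete choice of operation into a softmax-weighted mixture, so the architecture parameters can be optimized by gradient descent along with the network weights. A minimal PyTorch sketch of such a mixed operation; the candidate set is illustrative, and the alternation between weight and architecture updates on separate data splits used by DARTS is omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Continuous relaxation of a categorical choice between candidate operations."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Identity(),
        ])
        # Architecture parameters: one logit per candidate op, trained by gradient descent.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

    def chosen_op(self):
        # After the search, the discrete architecture keeps only the highest-weighted op.
        return self.ops[int(self.alpha.argmax())]

mixed = MixedOp(channels=16)
out = mixed(torch.randn(2, 16, 32, 32))      # all candidates contribute during search
print(out.shape, mixed.chosen_op())
```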

IV. Model Evaluation

Once a model has been designed, it needs to be evaluated. The simplest approach is to train the model to convergence and judge it by its performance on a validation set, but this requires a great deal of time and computing resources. Many algorithms have therefore been proposed to speed up model evaluation; they are summarized below.


1. Low-fidelity evaluation

A. Klein, S. Falkner, S. Bartels, P. Hennig, and F. Hutter, “Fast Bayesian optimization of machine learning hyperparameters on large datasets.” [Online]. Available: http://arxiv.org/abs/1605.07079

B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, “Learning transferable architectures for scalable image recognition.” [Online]. Available: http://arxiv.org/abs/1707.07012

E. Real, A. Aggarwal, Y. Huang, and Q. V. Le, “Regularized evolution for image classifier architecture search.” [Online]. Available: http://arxiv.org/abs/1802.01548

A. Zela, A. Klein, S. Falkner, and F. Hutter, “Towards automated deep learning: Efficient joint neural architecture and hyperparameter search.” [Online]. Available: http://arxiv.org/abs/1807.06906

Y.-q. Hu, Y. Yu, W.-w. Tu, Q. Yang, Y. Chen, and W. Dai, “Multi-Fidelity Automatic Hyper-Parameter Tuning via Transfer Series Expansion,” p. 8, 2019.

2. Transfer learning

C. Wong, N. Houlsby, Y. Lu, and A. Gesmundo, “Transfer learning with neural AutoML,” in Advances in Neural Information Processing Systems, 2018, pp. 8356–8365.

T. Wei, C. Wang, Y. Rui, and C. W. Chen, “Network morphism,” in International Conference on Machine Learning, 2016, pp. 564–572.

T. Chen, I. Goodfellow, and J. Shlens, “Net2Net: Accelerating learning via knowledge transfer,” arXiv preprint arXiv:1511.05641, 2015.

3. Surrogate-based methods

K. Eggensperger, F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Surrogate benchmarks for hyperparameter optimization,” in MetaSel @ ECAI, 2014, pp. 24–31.

C. Wang, Q. Duan, W. Gong, A. Ye, Z. Di, and C. Miao, “An evaluation of adaptive surrogate modeling based optimization with two benchmark problems,” Environmental Modelling & Software, vol. 60, pp. 167–179, 2014.

K. Eggensperger, F. Hutter, H. Hoos, and K. Leyton-Brown, “Efficient benchmarking of hyperparameter optimizers via surrogates,” in Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.

K. K. Vu, C. D’Ambrosio, Y. Hamadi, and L. Liberti, “Surrogate-based methods for black-box optimization,” International Transactions in Operational Research, vol. 24, no. 3, pp. 393–424, 2017.

C. Liu, B. Zoph, M. Neumann, J. Shlens, W. Hua, L.-J. Li, L. Fei-Fei, A. Yuille, J. Huang, and K. Murphy, “Progressive neural architecture search.” [Online]. Available: http://arxiv.org/abs/1712.00559

4. Early stopping

A. Klein, S. Falkner, J. T. Springenberg, and F. Hutter, “Learning curve prediction with Bayesian neural networks,” 2016.

B. Deng, J. Yan, and D. Lin, “Peephole: Predicting network performance before training,” arXiv preprint arXiv:1712.03351, 2017.

T. Domhan, J. T. Springenberg, and F. Hutter, “Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves,” in Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.

M. Mahsereci, L. Balles, C. Lassner, and P. Hennig, “Early stopping without a validation set,” arXiv preprint arXiv:1703.09580, 2017.
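The references above predict or extrapolate learning curves; the simplest version of the same idea is a plain patience-based early-stopping loop, sketched below. `train_one_epoch` and `validation_error` are hypothetical placeholders standing in for a real training run.

```python
def train_with_early_stopping(model, max_epochs=100, patience=5):
    """Stop training once the validation error has not improved for `patience` epochs."""
    best_err, best_epoch = float("inf"), 0
    for epoch in range(max_epochs):
        train_one_epoch(model)               # hypothetical: one pass over the training data
        err = validation_error(model)        # hypothetical: error on the held-out validation set
        if err < best_err:
            best_err, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            break                            # no improvement for `patience` epochs: stop early
    return best_err
```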

V. Summary of NAS Algorithm Performance


The figure below summarizes the search time and accuracy of different NAS algorithms on CIFAR-10. Compared with methods based on reinforcement learning and evolutionary algorithms, methods based on gradient descent and random search can find well-performing network models with far less search time.

[Figure: search time and accuracy of different NAS algorithms on CIFAR-10]

VI. Conclusions

From this summary of recent AutoML research, we find the following open problems worth thinking about and solving:

1. A complete pipeline system

There are now many open-source AutoML libraries, such as TPOT and Auto-sklearn, but each covers only one or a few stages of the pipeline; no system yet automates the entire process. How to integrate all of the stages above into one fully automated system is a direction that needs continued research.

2. Interpretability

One drawback of deep networks is their poor interpretability, and AutoML suffers from the same problem during architecture search. There is still no rigorous scientific explanation of why certain operations perform better. For example, for cell-based designs it is hard to explain why simply stacking cells yields well-performing architectures, and it is equally worth asking why the weight sharing proposed in ENAS works.

3. Reproducibility

Most AutoML work only reports results; complete code is rarely released, and some papers provide only the final searched architecture without the code for the search process. Many proposed methods are also hard to reproduce, partly because many tricks used during the actual search are not described in detail in the paper, and partly because architecture search is inherently stochastic. How to ensure the reproducibility of AutoML techniques is therefore another direction for future work.

4. Flexible encoding schemes

Summarizing NAS methods, we find that every search space is designed on the basis of human experience, so the resulting architectures never escape the framework that humans have designed. For example, today's NAS cannot invent a new primitive operation comparable to convolution out of thin air, nor can it generate a network as complex as the Transformer. Defining a more general and flexible encoding of network structures is therefore another problem worth studying.

5. Lifelong learning

Most AutoML systems design architectures for a specific dataset and task and generalize poorly to new data, whereas a person who has seen some pictures of cats and dogs can still recognize cats and dogs they have never seen before. A robust AutoML system should therefore be capable of lifelong learning: retaining what it has learned from old data while also learning from new data.
