FASTCF: FPGA-based Accelerator for STochastic-Gradient-Descent-based Collaborative Filtering
暂无分享,去创建一个
Viktor K. Prasanna | Rajgopal Kannan | Yu Min | Shijie Zhou | V. Prasanna | R. Kannan | Shijie Zhou | Yu Min
[1] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.
[2] Arda Yurdakul,et al. Efficient Implementations of Multi-pumped Multi-port Register Files in FPGAs , 2013, 2013 Euromicro Conference on Digital System Design.
[3] Dong Yu,et al. On parallelizability of stochastic gradient descent for speech DNNS , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Inderjit S. Dhillon,et al. Parallel matrix factorization for recommender systems , 2014, Knowl. Inf. Syst..
[5] Vaclav Petricek,et al. Recommender System for Online Dating Service , 2007, ArXiv.
[6] Ying Liu,et al. An efficient parallel collaborative filtering algorithm on multi-GPU platform , 2014, The Journal of Supercomputing.
[7] Rajesh Gupta,et al. Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs , 2017, FPGA.
[8] CARLOS A. GOMEZ-URIBE,et al. The Netflix Recommender System , 2015, ACM Trans. Manag. Inf. Syst..
[9] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[10] Weimin Zheng,et al. Exploring the Hidden Dimension in Graph Processing , 2016, OSDI.
[11] Keshav Pingali,et al. Stochastic gradient descent on GPUs , 2015, GPGPU@PPoPP.
[12] Viktor K. Prasanna,et al. Accelerating Large-Scale Single-Source Shortest Path on FPGA , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop.
[13] Taghi M. Khoshgoftaar,et al. A Survey of Collaborative Filtering Techniques , 2009, Adv. Artif. Intell..
[14] Darshika G. Perera,et al. An efficient embedded multi-ported memory architecture for next-generation FPGAs , 2017, 2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP).
[15] Margaret Martonosi,et al. Graphicionado: A high-performance and energy-efficient accelerator for graph analytics , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[16] Gopalakrishna Hegde,et al. CaffePresso: An optimized library for Deep Learning on embedded accelerator-based platforms , 2016, 2016 International Conference on Compliers, Architectures, and Sythesis of Embedded Systems (CASES).
[17] Kathryn Fraughnaugh,et al. Introduction to graph theory , 1973, Mathematical Gazette.
[18] Viktor K. Prasanna,et al. Optimizing memory performance for FPGA implementation of pagerank , 2015, 2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig).
[19] Liana L. Fong,et al. Faster and Cheaper: Parallelizing Large-Scale Matrix Factorization on GPUs , 2016, HPDC.
[20] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[21] Yu Wang,et al. A Reconfigurable Computing Approach for Efficient and Scalable Parallel Graph Exploration , 2012, 2012 IEEE 23rd International Conference on Application-Specific Systems, Architectures and Processors.
[22] Pradeep Dubey,et al. GraphMat: High performance graph analytics made productive , 2015, Proc. VLDB Endow..
[23] Viktor K. Prasanna,et al. A Hybrid Approach for Mapping Conjugate Gradient onto an FPGA-Augmented Reconfigurable Supercomputer , 2006, 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.
[24] Jing Li,et al. Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network , 2017, FPGA.
[25] Yehuda Koren,et al. Matrix Factorization Techniques for Recommender Systems , 2009, Computer.
[26] J. Gregory Steffan,et al. Multi-ported memories for FPGAs via XOR , 2012, FPGA '12.
[27] Chao Wang,et al. An FPGA-Based Accelerator for Neighborhood-Based Collaborative Filtering Recommendation Algorithms , 2015, 2015 IEEE International Conference on Cluster Computing.
[28] Nancy M. Amato,et al. Faster Parallel Traversal of Scale Free Graphs at Extreme Scale with Vertex Delegates , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[29] James C. Hoe,et al. GraphGen: An FPGA Framework for Vertex-Centric Graph Computation , 2014, 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines.
[30] Viktor K. Prasanna,et al. High-Throughput and Energy-Efficient Graph Processing on FPGA , 2016, 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[31] Haibo Chen,et al. Bipartite-Oriented Distributed Graph Partitioning for Big Learning , 2014, Journal of Computer Science and Technology.
[32] Pradeep Dubey,et al. Navigating the maze of graph analytics frameworks using massive graph datasets , 2014, SIGMOD Conference.