Use of SIMD Vector Operations to Accelerate Application Code Performance on Low-Powered ARM and Intel Platforms

Author(s)
Mitra, G
Johnston, B
Rendell, AP
McCreath, E
Zhou, J
Editor(s)

Yunquan Zhang, Xiaosong Ma

Date
2013
Size

263463 bytes

File type(s)

application/pdf

Location

Boston, United States

Abstract

Augmenting a processor with special hardware that is able to apply a Single Instruction to Multiple Data (SIMD) at the same time is a cost effective way of improving processor performance. It also offers a means of improving the ratio of processor performance to power usage due to reduced and more effective data movement and intrinsically lower instruction counts. This paper considers and compares the NEON SIMD instruction set used on the ARM Cortex-A series of RISC processors with the SSE2 SIMD instruction set found on Intel platforms within the context of the Open Computer Vision (OpenCV) library. The performance obtained using compiler auto-vectorization is compared with that achieved using hand-tuning across a range of five different benchmarks and ten different hardware platforms. On the ARM platforms the hand-tuned NEON benchmarks were between 1.05× and 13.88× faster than the auto-vectorized code, while for the Intel platforms the hand-tuned SSE benchmarks were between 1.34× and 5.54× faster.

Conference Title

Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013

Rights Statement

© 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Subject

Computer System Architecture
