Intel® VTune™ Amplifier 2014 for Systems

Performance Profiler

  • Collect a rich set of data to tune performance & multi-core scalability
  • Sort, filter and visualize results to quickly find bottlenecks
  • Easy profiling of remote Linux* and Android* targets

 

Overview

Optimize Serial and Parallel Performance

Intel® VTune™ Amplifier is the premier performance profiler for C, C++, Fortran, Assembly and Java*.

Easy
Performance optimization can be difficult, but the performance profiling tool you use shouldn’t be.

Versatile – Rich Set of Performance Profiles
Whether you are tuning for the first time or doing advanced performance optimization, VTune Amplifier provides the data needed to meet a wide variety of tuning needs. Collect a rich set of performance data for hotspots, threading, locks & waits, bandwidth and more.

Productive – Sort, Filter and Visualize
Good data is not enough. You need tools to mine the data and make it easy to understand. Powerful analysis lets you sort, filter and visualize results on the timeline and on your source.


Quotes

“We achieved a significant improvement (almost 2x) even on one core by optimizing the code based on the information provided by Intel® VTune™ Amplifier XE. Good scalability is a result of usage of combination of Intel® TBB and OpenMP parallelization techniques. We achieved over 8x the performance of the previous version on 8 cores and almost 11x the performance on 16 cores.”
Alexey Andrianov, R&D Director Deputy, Mechanical Analysis Division, Mentor Graphics Corporation

“Intel® VTune™ Amplifier XE analyzes complex code and helps us identify bottlenecks rapidly. By using it and other Intel® Software Development Tools, we were able to improve PIPESIM performance up to 10 times compared with the previous software version.”
Rodney Lessard, Senior Scientist, Schlumberger

“The new VTune™ Amplifier XE brings even more capability to an already indispensable tool. The sampling based call stack hotspots is excellent and alone is worthy of the upgrade. We have also been impressed by how the concurrency and Locks and Waits analysis can even provide useful data on complex applications such as Premiere Pro.”
Rich Gerber – Engineering Manager, MediaCore, Adobe Systems Inc.

“The new interface is a joy to use. Intel® VTune Amplifier XE gives us precise, down-to-the-metal performance data that’s invaluable for pinpointing hotspots and evaluating the effect of optimizations”
Daniel Schwarz, Performance Engineer, Nik Software

“Intel® VTune™ Amplifier XE’s timeline is very information intensive.  It organizes the data I need to tune threaded applications.”
Sergey Zaritchny, Software Development Manager, Open Cascade SAS

“Last week, Intel® VTune™ Amplifier XE helped us find almost 3X performance improvement.  This week it helped us improve the performance another 3X.”
Claire Cates, Principal Developer, SAS Institute Inc.

“One of Intel® VTune™ Amplifier XE’s best features is that it is easy to use.  I did not need to read the documentation.”
Richard Shepherd, Software Engineer, ESRI (UK) Limited



What’s New in 2014?

We continuously release new features in regular updates available to all customers with a current service agreement (one year included with purchase). Just download, install and get all the latest stuff.

For specific details, see our What’s New? summary for each update. This is published for VTune Amplifier XE, but many of the improvements also apply to VTune Amplifier for Systems.

Here is a partial list of new features for VTune Amplifier 2014 for Systems:

More Profiling Data

  • Basic Hotspots for Android. Collect performance data without the need to install drivers or get a special developers release of Android.

Easier to Use

  • Easier remote collection for Linux and Android targets. Use the VTune Amplifier user interface on the host system to collect data from a remote Linux or Android target.

New OS & Processor Support

  • Latest processors
  • Latest Linux distributions

Features

Profile Remote Systems

Configure your host system to collect data from a remote Linux* or Android* target.

Quickly Locate Code Taking A Lot of CPU Time

Hotspots analysis gives you a sorted list of the functions using a lot of CPU time. This is where tuning will give you the biggest benefit. Click [+] for the call stacks. Double click to see the source.

Now available for Android.

See the Results on Your Source

A double click from the function list takes you to the hottest spot in the function.

Power & Energy Profiling

Intel® Energy Profiler uses the same productive VTune Amplifier user interface to analyze data. Get all the same analysis and filtering capabilities without having to learn a new interface.

Mine the Data with Timeline Filtering

Select a time range in the timeline to filter out data (e.g., application startup) that masks the information you need. When you select and filter in the timeline, the grid that lists functions using a lot of CPU time updates to show the list filtered for the selected time.
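The effect of timeline filtering can be sketched in a few lines of Python. This is only an illustration of the idea, not how VTune Amplifier stores or processes its data: the sample records, function names and the `hotspots` helper below are all hypothetical.

```python
from collections import Counter

# Hypothetical sample records: (timestamp in ms, function on top of the stack).
SAMPLES = [
    (0, "load_config"), (10, "load_config"), (20, "load_config"),
    (30, "render"), (40, "render"), (50, "physics"),
    (60, "render"), (70, "render"), (80, "physics"), (90, "render"),
]

def hotspots(samples, start_ms=None, end_ms=None):
    """Count samples per function, optionally restricted to [start_ms, end_ms)."""
    counts = Counter(
        func
        for t, func in samples
        if (start_ms is None or t >= start_ms)
        and (end_ms is None or t < end_ms)
    )
    # most_common() returns (function, sample_count) pairs, hottest first.
    return counts.most_common()

whole_run = hotspots(SAMPLES)                 # includes application startup
steady_state = hotspots(SAMPLES, start_ms=30) # startup filtered out
```

In this made-up data, `load_config` accounts for three of the ten samples over the whole run but drops out of the list once the first 30 ms are excluded, which is the kind of masking the time-range selection is meant to remove.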

Visualize Thread Behavior

See when threads are running and waiting, and when transitions occur. Balance workloads. Find lock contention.

Tune Threading with Locks and Waits Analysis

Quickly find a common cause of slow performance in parallel programs: waiting too long on a lock while the cores are underutilized during the wait. Profiles like “basic hotspots” and “locks & waits” use a software collector that works on both Intel® and compatible processors (Linux only).
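For readers who have not seen the pattern, here is a minimal Python sketch of the kind of problem a locks and waits analysis surfaces. It is a toy reproduction, not VTune Amplifier output; the thread roles, the 50 ms critical section and the `wait_seconds` dictionary are assumptions made for illustration.

```python
import threading
import time

lock = threading.Lock()
holder_has_lock = threading.Event()
wait_seconds = {}  # thread name -> time spent blocked on the lock

def holder():
    with lock:
        holder_has_lock.set()
        time.sleep(0.05)  # simulate a long critical section

def waiter():
    holder_has_lock.wait()            # make sure the lock is already taken
    t0 = time.perf_counter()
    with lock:                        # blocks until holder releases the lock
        wait_seconds["waiter"] = time.perf_counter() - t0

threads = [threading.Thread(target=holder), threading.Thread(target=waiter)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The waiter does no useful work for roughly the length of the holder's critical section. A profiler reports this kind of wait time per synchronization object, so you can decide whether to shorten the critical section or restructure the locking.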

Low Overhead / High Resolution Hardware Profiling

In addition to “basic hotspots” analysis that works on both Intel® and compatible processors, VTune Amplifier has “advanced hotspots” analysis that uses the Performance Monitoring Unit (PMU) on Intel processors to collect data with very low overhead. Increased resolution (~1 ms vs. ~10 ms) can find hotspots in small functions that run quickly.
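Why the sampling interval matters can be shown with a toy model in Python. This is a simplified sketch under stated assumptions: samples are taken at perfectly regular intervals, and the timeline, function names and `sample_counts` helper are hypothetical. It does not model how the PMU or the software collector actually works.

```python
def sample_counts(intervals, total_us, period_us):
    """Count how many periodic samples land in each function's interval.

    intervals: list of (name, start_us, end_us), non-overlapping, end exclusive.
    """
    counts = {}
    for t in range(0, total_us, period_us):
        for name, start, end in intervals:
            if start <= t < end:
                counts[name] = counts.get(name, 0) + 1
                break
    return counts

# Hypothetical 100 ms run: a 4 ms function sits between two longer phases.
TIMELINE = [
    ("setup", 0, 12_000),
    ("small_fn", 12_000, 16_000),
    ("main_loop", 16_000, 100_000),
]

coarse = sample_counts(TIMELINE, 100_000, 10_000)  # ~10 ms sampling
fine = sample_counts(TIMELINE, 100_000, 1_000)     # ~1 ms sampling
```

With a 10 ms interval no sample falls inside `small_fn`, so it never appears in the results; with a 1 ms interval it collects four samples. In a real run the sample points are not aligned this neatly, but the expected number of samples in a short function still scales with the sampling rate.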

Collect Kernel Stacks

Hardware profiling now has optional stack collection to identify the calling sequence including kernel stacks. Optimization is easier when you can see how kernel/driver functions are called. (Android: Requires a developers OS release containing drivers.)

Advanced Analysis Like Bandwidth

Preset profiles provide an easy “point and shoot” set-up. Choose Hotspot, Lightweight Hotspot, Concurrency, Locks and Waits or more advanced analyses. No memorizing complex event names. Advanced profiles like memory bandwidth analysis, memory access and branch mispredictions find tuning opportunities. Advanced profiles can optionally collect stacks to identify the calling sequence. (Profiles vary by microarchitecture. Android: Requires a developers OS release containing drivers.)

Opportunities Highlighted

The cell is highlighted in pink when there is a potential tuning opportunity. Hover to get suggestions.

No special builds for your applications

Use a production build with symbols from your normal compiler.

Low overhead

Accurate results you can count on.

Command line

Automate regression analysis. Simple remote collection.

System Wide Analysis

Tune drivers, kernel modules and multi-process apps. (Android: Requires a developers OS release containing drivers.)

Low Overhead Java* Profiling

Analyze Java or mixed Java and native code.  Results are mapped to the original Java source.  Unlike some Java profilers that instrument the code, VTune Amplifier uses low overhead statistical sampling with either a hardware or software collector.  Hardware collection has extremely low overhead because it uses the on-chip performance monitoring hardware. (Android: Requires a rooted device.)

Analyze User Tasks

The task annotation API is used to annotate your source so VTune Amplifier can display which tasks are executing. For example, if you label the stages of your pipeline, they will be marked in the timeline and hovering will reveal details. This makes profiling data much easier to understand.
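The general idea of labeling pipeline stages can be sketched in plain Python. The `task` context manager and `task_log` list below are hypothetical stand-ins written for this illustration; they are not the VTune Amplifier task annotation API, whose actual calls are described in the product documentation.

```python
import time
from contextlib import contextmanager

task_log = []  # (label, start_s, end_s) records a timeline view could draw

@contextmanager
def task(label):
    """Hypothetical helper: record when a labeled task starts and ends."""
    start = time.perf_counter()
    try:
        yield
    finally:
        task_log.append((label, start, time.perf_counter()))

def run_pipeline(items):
    with task("parse"):
        parsed = [int(x) for x in items]
    with task("transform"):
        squared = [x * x for x in parsed]
    with task("reduce"):
        return sum(squared)

result = run_pipeline(["1", "2", "3"])
```

Each entry in `task_log` carries a label and a begin and end time, which is the information a timeline needs to mark the stage and show details on hover.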

Tune Inlining with Call Counts

When a function is called frequently it may make sense to “inline” the code and eliminate the overhead of the function call. VTune Amplifier now provides statistical call count data to help you make better inlining decisions. It also displays profile results on the source code, even if the code is inlined, making it easier to interpret profile results.

Support for New Processors

VTune Amplifier is constantly adding support for the latest processors.

“Hot keys” Start and Stop Analysis

Add a shortcut to quickly launch performance analysis whenever you see your app running slowly. Program hot keys to start and stop the collection of performance data.

Technical Specifications

For additional information and details on new features, please see the “What’s new?” articles and release notes.

 

 
