2013-12-23

… not surprisingly, as it is built on the same micro-architecture. Even Intel will manufacture Cortex-A53 based SoCs for Altera (Stratix 10 FPGA SoCs) in 2015 on its leading edge Tri-Gate (FinFET) 14nm process.

With MediaTek MT6592-based True Octa-core superphones are on the market to beat Qualcomm Snapdragon 800-based ones [‘Experiencing the Cloud’, Dec 21, 2013], MediaTek will follow up with a 4G LTE MT6595 version in January, and with a 64-bit version based on Cortex-A53 instead of Cortex-A7 in H2 CY14. In this way it will be able to compete head-on with the new Qualcomm Snapdragon 410 in the most lucrative high-volume market.

According to 大陸4G啟動 聯發科快攻 (“Mainland China launches 4G, MediaTek moves fast”) [Commercial Times, Dec 10, 2013]: “MediaTek’s first 4G modem chip, the MT6590, is expected to begin shipping next month. In addition, the 4G integrated single-chip (SoC) MT6595 appeared earlier this month on customers’ specification sheets, with an 8-core design as its mainstay, so MediaTek’s ambition to expand into the high-end market is not hard to see.”

MediaTek delivering 4G LTE chips for verification, say paper [DIGITIMES, Dec 18, 2013]

MediaTek reportedly has delivered its first 4G LTE chip, the MT6590, to potential clients for verification. The chips are expected to begin generating revenues for the IC design house in the first quarter of 2014, according to a Chinese-language Liberty Times report. The MT6590 supports five modes and 10 frequency bands.

The news echoes earlier remarks by MediaTek president Hsieh Ching-chiang stating the company plans to launch 4G chips at year-end 2013 with end-market devices powered by the 4G chips to be available in the first quarter of 2014, the paper added.

Citing data from JPMorgan Chase, the paper said shipments of MediaTek’s first 8-core chip, the MT6592, are higher than expected and shipment momentum is likely to continue into the first quarter of 2014.

The latest news: Chipset vendors to showcase 64-bit smartphone solutions at CES 2014 [DIGITIMES, Dec 23, 2013]

Chipset players including Qualcomm, Nvidia, Marvell Technology and Broadcom all are expected to showcase 64-bit processors for smartphone applications at the upcoming CES 2014 trade show, a move which will add pressure on Taiwan-based MediaTek in its efforts to expand market share with its newly released 8-core CPUs, according to industry sources.

Qualcomm has already unveiled a 64-bit chip, the Snapdragon 410, and is expected to begin sampling in the first half of 2014, according to the company.

Nvidia, which is familiar with 64-bit computing architectures, is expected to start volume production of 64-bit chips for smartphones in the first half of 2014 at the earliest, said industry sources.

Marvell and Broadcom are also expected to highlight their 64-bit chips at CES 2014, kicking off competition in the 64-bit chipset segment, note the sources.

Meanwhile, the vendors, as well as China-based chipset suppliers Spreadtrum Communications and RDA Microelectronics, will also exert efforts to take market share from MediaTek in the entry-level to mid-range chipset segment in 2014, commented the sources.

From: 64-bit smartphones to be ushered in 2014, say sources [DIGITIMES, Dec 11, 2013]

… Qualcomm has also claimed that the Snapdragon 410 will support all major operating systems, including Android, Windows Phone and Firefox OS and that Qualcomm Reference Design versions of the processor will be available to enable rapid development time and reduce OEM R&D, designed to provide a comprehensive mobile device platform. However, the observers noted that the Snapdragon 410 chips are aiming at the mid-range LTE smartphone segment, particularly the sub-CNY1,000 (US$165) sector in China. The launch of the mid-range 64-bit Snapdragon chips also aims to widen its lead against Taiwan-based rival MediaTek in the China market, the sources added. Qualcomm said the Snapdragon 410 processor is expected to be in commercial devices in the second half of 2014. …

Samsung Electronics is also believed to be working on its own 64-bit CPUs in house and expected to launch 64-bit capable flagship models in the first half of 2014 at the earliest, said the observers.

The 64-bit versions of CPUs from MediaTek, Broadcom and Nvidia are likely to come in late 2014 or in 2015, added the sources.

Google is expected to accelerate the upgrading of its Android platform, providing an environment for software developers to work on related 64-bit applications, commented the sources.

Taiwan IC suppliers developing chips for MediaTek smartphone solutions [DIGITIMES, Dec 18, 2013]

MediaTek’s growing shipments of smartphone solutions, which are expected to top 200 million units in 2013 and 300 million units in 2014, have encouraged Taiwan-based suppliers of LCD driver ICs, power management ICs, ambient light sensors, gyroscopes, touchscreen controller ICs and MEMS microphones to develop chips that can be incorporated into these smartphone solutions, according to industry sources.

MediaTek has been focusing its R&D efforts on developments of 4- and 8-core and 4G CPUs as well as wireless chips in order to maintain its competitiveness, while relying on other IC vendors to complete its smartphone solution platforms, the sources noted.

With MediaTek’s smartphone solution shipments expected to reach 30 million units a month in 2014, any suppliers which can deliver IC parts for MediaTek’s smartphone platforms will see their revenues and profits grow substantially in 2014, the sources said.

Qualcomm Technologies Introduces Snapdragon 410 Chipset with Integrated 4G LTE World Mode for High-Volume Smartphones [press release, Dec 9, 2013]

4G LTE, 64-Bit Processing Expands Qualcomm Technologies’ Global Product Offerings and Reference Design Program

SAN DIEGO – December 09, 2013 – Qualcomm Incorporated (NASDAQ: QCOM) today announced that its wholly-owned subsidiary, Qualcomm Technologies, Inc., has introduced the Qualcomm® Snapdragon™ 410 chipset with integrated 4G LTE World Mode. The delivery of faster connections is important to the growth and adoption of smartphones in emerging regions, and Qualcomm Snapdragon chipsets are poised to address the needs of consumers as 4G LTE begins to ramp in China.

The new Snapdragon 410 chipsets are manufactured using 28nm process technology. They feature processors that are 64-bit capable along with superior graphics performance with the Adreno 306 GPU, 1080p video playback and up to a 13 Megapixel camera. Snapdragon 410 chipsets integrate 4G LTE and 3G cellular connectivity for all major modes and frequency bands across the globe and include support for Dual and Triple SIM. Together with Qualcomm RF360 Front End Solution, Snapdragon 410 chipsets will have multiband and multimode support. Snapdragon 410 chipsets also feature Qualcomm Technologies’ Wi-Fi, Bluetooth, FM and NFC functionality, and support all major navigation constellations: GPS, GLONASS, and China’s new BeiDou, which helps deliver enhanced accuracy and speed of Location data to Snapdragon-enabled handsets.

The chipset also supports all major operating systems, including the Android, Windows Phone and Firefox operating systems. Qualcomm Reference Design versions of the processor will be available to enable rapid development time and reduce OEM R&D, designed to provide a comprehensive mobile device platform. The Snapdragon 410 processor is anticipated to begin sampling in the first half of 2014 and expected to be in commercial devices in the second half of 2014.

Qualcomm Technologies also announced for the first time the intention to make 4G LTE available across all of the Snapdragon product tiers. The Snapdragon 410 processor gives the 400 product tier several 4G LTE options for high-volume mobile devices, as the third LTE-enabled solution in the product tier. By offering 4G LTE variants to its entry level smartphone lineup, Qualcomm Technologies ensures that emerging regions are equipped for this transition while also having every major 2G and 3G technology available to them. Qualcomm Technologies offers OEMs and operators differentiation through a rich feature set upon which to build innovative high-volume smartphones for budget-conscious consumers.

“We are excited to bring 4G LTE to highly affordable smartphones at a sub $150 (~1,000 RMB) price point with the introduction of the Qualcomm Snapdragon 410 processor,” said Jeff Lorbeck, senior vice president and chief operating officer, Qualcomm Technologies, China. “The Snapdragon 410 chipset will also be the first of many 64-bit capable processors as Qualcomm Technologies helps lead the transition of the mobile ecosystem to 64-bit processing.”

Qualcomm Technologies will release the Qualcomm Reference Design (QRD) version of the Snapdragon 410 processor with support for Qualcomm RF360™ Front End Solution. The QRD program offers Qualcomm Technologies’ leading technical innovation, easy customization options, the QRD Global Enablement Solution which features regional software packages, modem configurations, testing and acceptance readiness for regional operator requirements, and access to a broad ecosystem of hardware component vendors and software application developers. Under the QRD program, customers can rapidly deliver differentiated smartphones to value-conscious consumers. There have been more than 350 public QRD-based product launches to date in collaboration with more than 40 OEMs in 18 countries.

Note that just 19 days before that there was the news that Qualcomm Technologies Announces Next Generation Qualcomm Snapdragon 805 “Ultra HD” Processor [press release, Nov 20, 2013]

Mobile Technology Leader Announces its Highest Performance Processor Designed to Deliver the Highest Quality Mobile Video, Camera and Graphics to Qualcomm Snapdragon 800 Tier

NEW YORK – November 20, 2013 – Qualcomm Incorporated (NASDAQ: QCOM) today announced that its subsidiary, Qualcomm Technologies, Inc., introduced the next generation mobile processor of the Qualcomm® Snapdragon™ 800 tier, the Qualcomm Snapdragon 805 processor, which is designed to deliver the highest-quality mobile video, imaging and graphics experiences at Ultra HD (4K) resolution, both on device and via Ultra HD TVs. Featuring the new Adreno 420 GPU, with up to 40 percent more graphics processing power than its predecessor, the Snapdragon 805 processor is the first mobile processor to offer system-level Ultra HD support, 4K video capture and playback and enhanced dual camera Image Signal Processors (ISPs), for superior performance, multitasking, power efficiency and mobile user experiences.

The Snapdragon 805 processor is Qualcomm Technologies’ newest and highest performing Snapdragon processor to date, featuring:

- Blazing fast apps and web browsing and outstanding performance: Krait 450 quad-core CPU, the first mobile CPU to run at speeds of up to 2.5 GHz per core, plus superior memory bandwidth support of up to 25.6 GB/second that is designed to provide unprecedented multimedia and web browsing performance.

- Smooth, sharp user interface and games support Ultra HD resolution: The mobile industry’s first end-to-end Ultra HD solution with on-device display concurrent with output to HDTV; features Qualcomm Technologies’ new Adreno 420 GPU, which introduces support for hardware tessellation and geometry shaders, for advanced 4K rendering, with even more realistic scenes and objects, visually stunning user interface, graphics and mobile gaming experiences at lower power.

- Fast, seamless connected mobile experiences: Custom, efficient integration with either the Qualcomm® Gobi™ MDM9x25 or the Gobi MDM9x35 modem, powering superior seamless connected mobile experiences. The Gobi MDM9x25 chipset announced in February 2013 has seen significant adoption as the first embedded, mobile computing solution to support LTE carrier aggregation and LTE Category 4 with superior peak data rates of up to 150Mbps. Additionally, Qualcomm’s most advanced Wi-Fi for mobile, 2-stream dual-band Qualcomm® VIVE™ 802.11ac, enables wireless 4K video streaming and other media-intensive applications. With a low-power PCIe interface to the QCA6174, tablets and high-end smartphones can take advantage of faster mobile Wi-Fi performance (over 600 Mbps), extended operating range and concurrent Bluetooth connections, with minimal impact on battery life.

- Ability to stream more video content at higher quality using less power: Support for Hollywood Quality Video (HQV) for video post processing, first to introduce hardware 4K HEVC (H.265) decode for mobile for extremely low-power HD video playback.

- Sharper, higher resolution photos in low light and advanced post-processing features: First Gpixel/s throughput camera support in a mobile processor designed for a significant increase in camera speed and imaging quality. Sensor processing with gyro integration enables image stabilization for sharper, crisper photos. Qualcomm Technologies is the first to announce a mobile processor with advanced, low-power, integrated sensor processing, enabled by its custom DSP, designed to deliver a wide range of sensor-enabled mobile experiences.

“Using a smartphone or tablet powered by Snapdragon 805 processor is like having an UltraHD home theater in your pocket, with 4K video, imaging and graphics, all built for mobile,” said Murthy Renduchintala, executive vice president, Qualcomm Technologies, Inc., and co-president, QCT. “We’re delivering the mobile industry’s first truly end-to-end Ultra HD solution, and coupled with our industry leading Gobi LTE modems and RF transceivers, streaming and watching content at 4K resolution will finally be possible.”

The Snapdragon 805 processor is sampling now and expected to be available in commercial devices by the first half of 2014.

The original value proposition was presented in the brief Brian Jeff highlights the ARM® Cortex™-A53 processor [ARMflix YouTube channel, Oct 30, 2012] video as follows

Brian Jeff highlights the ARM® Cortex™-A53 processor, ARM’s most efficient application processor ever, delivering today’s mainstream smartphone experience in a quarter of the power in the respective process nodes.

The Top 5 Things to Know about Cortex-A53 [Brian Jeff on ‘ARM Connected Community’, Oct 28, 2013]

The Cortex-A53 was introduced to the market in October 2012, delivering the ARMv8 instruction set and significantly increased performance in a highly efficient power and area footprint. It is available for licensing now, and will be deployed in silicon in early 2014 by multiple ARM partners. There are a few key aspects of the Cortex-A53 that developers, OEMs, and SoC designers should know:

1. ARM low power / high efficiency heritage

The ARM9 is the most licensed processor in ARM’s history with over 250 licenses sold. It identified a very important power/cost sweet spot. The Cortex-A5 (launched in 2009) was designed to fit in the same CPU power and area footprint while delivering significantly higher performance and power efficiency, and to bring it to the modern ARMv7 feature set: software compatibility with the high end of the processor roadmap (then Cortex-A9).

[Image: ARM926-based feature phone (Nokia E60).]



The Cortex-A53 is built around a simple pipeline, 8 stages long with in-order execution like the Cortex-A7 and Cortex-A5 processors that preceded it. An instruction traversing a simple pipeline requires fewer registers and switches less logic to fetch, decode, issue, execute, and write back the results than a more complex pipeline microarchitecture. Simpler pipelines are smaller and lower power. The high efficiency Cortex-A CPU product line, consisting of Cortex-A5, Cortex-A7, and Cortex-A53, takes a design approach prioritizing efficiency first, then seeking as much performance as possible at the maximum efficiency. The added performance in each successive generation in this series comes from advances in the memory system, increasing dual-issue capability, expanded internal busses, and improved branch prediction.

2. ARM v8-A Architecture

The Cortex-A53 is fully compliant with the ARMv8-A architecture, which is the latest ARM architecture and introduces support for 64b operation while maintaining 100% backward compatibility with the broadly deployed ARMv7 architecture. The processor can switch between AArch32 and AArch64 modes of operation to allow 32bit apps and 64bit apps to run together on top of a 64bit operating system. This dual execution state support allows maximum flexibility for developers and SoC designers in managing the rollout of 64bit support in different markets. ARMv8-A brings additional features (more registers, new instructions) that bring increased performance and Cortex-A53 is able to take advantage of these.
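The dual execution state idea can be made concrete from the software side: on a 64-bit operating system, each process runs as either a 32-bit or a 64-bit program. The following is only a minimal sketch using the Python standard library; the helper name `process_bitness` is hypothetical, and it reports the pointer width of the running interpreter, not the capabilities of the CPU underneath.

```python
import platform
import struct


def process_bitness() -> int:
    """Return the pointer width, in bits, of the running process."""
    # struct.calcsize("P") is the size in bytes of a native C pointer (void *).
    return struct.calcsize("P") * 8


# platform.machine() returns the machine type string reported by the OS.
print(f"machine={platform.machine()} process={process_bitness()}-bit")
```

A 32-bit build of the interpreter would report 32 here even when the kernel and processor are 64-bit, which is the coexistence the paragraph above describes.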

3. Higher performance than Cortex-A9: smaller and more efficient too

The Cortex-A9 features an out-of-order pipeline, dual issue capability, and a longer pipeline than Cortex-A53 that enables 15% higher frequency operation. However the Cortex-A53 achieves higher single thread performance by pushing a simpler design farther – some of the key factors enabling the performance of the Cortex-A53 include the integrated low latency level 2 cache, the larger 512 entry main TLB, and the complex branch predictor. The Cortex-A9 has set the bar for the high end of the smartphone market through 2012 – by matching and exceeding that level of performance in a smaller footprint and power budget, the Cortex-A53 delivers performance to entry level devices that was previously enjoyed by high-end flagship mobile devices – in a lower power budget and at lower cost. The graph below compares the single thread performance of the high efficiency Cortex-A processors with the Cortex-A9. At the same frequency, Cortex-A53 delivers more than 20% higher instruction throughput than the Cortex-A9 for representative workloads.
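The two figures quoted above (about 15% higher frequency for the Cortex-A9, more than 20% higher per-clock throughput for the Cortex-A53) can be combined into a rough net comparison. This is a back-of-the-envelope sketch only: it assumes throughput scales linearly with clock frequency and that each core runs at its maximum frequency, and the function name `relative_performance` is hypothetical.

```python
def relative_performance(per_clock_gain: float, freq_advantage: float) -> float:
    """Ratio of Cortex-A53 to Cortex-A9 throughput.

    per_clock_gain: fractional per-clock advantage of the A53 (0.20 = 20%).
    freq_advantage: fractional clock advantage of the A9 (0.15 = 15%).
    Assumes throughput is proportional to per-clock rate times frequency.
    """
    return (1.0 + per_clock_gain) / (1.0 + freq_advantage)


ratio = relative_performance(0.20, 0.15)
print(f"A53 / A9 throughput ratio: {ratio:.3f}")
```

Under these assumptions the ratio stays slightly above 1, which is consistent with the claim that the simpler core matches or exceeds the Cortex-A9 even after the frequency difference is taken into account.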

4. Supports big.LITTLE with Cortex-A57

The Cortex-A53 is architecturally identical to the higher performance Cortex-A57 processor, and can be integrated with it in a big.LITTLE processor subsystem. big.LITTLE enables peak performance and extreme efficiency by distributing work to the right-sized processor for the task at hand.

It is described in more detail here – Ten Things to Know About big.LITTLE

The diagram above shows Cortex-A53 combined with Cortex-A57 and a Mali-T628 graphics processor in an example system. The CCI-400 cache coherent interconnect allows the 2 CPU clusters to be combined in a seamless way that allows software to manage the task allocation in a highly transparent way, as described in <link – software>. The big.LITTLE system enables peak performance at low average power.

Cortex-A53 is ideal for standalone use, delivering excellent performance at very low power and area, enabling new features to be supported in the low-cost smartphone segments. Our new LITTLE processor packs a performance punch.

Read more about that in a somewhat humorous blog on Cortex-A53 from the product launch – ARM Cortex-A53 — Who You callin’ LITTLE?

5. Extensive feature set for broad application support

The Cortex-A53 includes a feature set that allows it to be configured and optimized through physical implementation tailored to mobile SoCs and to scalable enterprise systems.

| Mobile Features | Enterprise Features |
| --- | --- |
| AMBA 4 ACE coherent bus; big.LITTLE processing (2 CPU clusters) with CCI-400 interconnect | AMBA 5 CHI coherent bus; scalable to 4 or more coherent CPU clusters for low-cost servers or networking infrastructure devices; 16-core systems with CCN-504 or 32-core systems with CCN-508, all on a single silicon die |
| Small area, low power design; optimized for <150 mW envelope | Small area, low power design; likely still optimized for 150 mW, although higher performance implementations can be used |
| ECC and parity available, but configurable if not needed | ECC and parity protection required for enterprise applications |

See also:

ARM Announces New High-Performance System IP to Address Demand for Energy-Efficient ‘Many-core’ Solutions for the Enterprise Market [press release, Oct 10, 2012]: “To address the significant increase in data over the next 10-15 years, and the demand for more energy-efficient network infrastructure and servers, ARM has announced the ARM® CoreLink™ CCN-504 cache coherent network. This advanced system intellectual property (IP) can deliver up to one terabit of usable system bandwidth per second.”

ARM Launches Cortex-A50 Series, the World’s Most Energy-Efficient 64-bit Processors [press release, Oct 30, 2012]

ARM Announces POP IP for Cortex-A50 Series Processors on TSMC 28nm HPM and 16nm FinFET Processes [press release, April 9, 2013]

ARM Announces AMBA 5 CHI Specification to Enable High Performance, Highly Scalable System on Chip Technology [press release, June 3, 2013]

Huawei announces global agreement to licence ARMv8 architecture – Agreement underlines Huawei’s commitment to IPR and the UK [Huawei press release, Sept 4, 2013]

From: AMD Details Embedded Product Roadmap [AMD press release, Sept 9, 2013]:
“ ‘Hierofalcon’ CPU SoC ‘Hierofalcon’ is the first 64-bit ARM-based platform from AMD targeting embedded data center applications, communications infrastructure and industrial solutions. It will include up to eight ARM Cortex™-A57 CPUs expected to run up to 2.0 GHz, and provides high-performance memory with two 64-bit DDR3/4 channels with error correction code (ECC) for high reliability applications. The highly integrated SoC includes 10 Gb KR Ethernet and PCI-Express Gen 3 for high-speed network connectivity, making it ideal for control plane applications. The “Hierofalcon” series also provides enhanced security with support for ARM TrustZone® technology and a dedicated cryptographic security co-processor, aligning to the increased need for networked, secure systems. “Hierofalcon” is expected to be sampling in the second quarter of 2014 with production in the second half of the year.”

MediaTek extends partnership with ARM to drive next-generation mobile and consumer technology [joint press release, Oct 8, 2013]: “MediaTek has acquired a broad license to Cortex-A50 Series processor cores and the next generation of ARM Mali graphics processing Unit (GPU) solutions.”

Broadcom Announces Server-Class ARMv8-A Multi-Core Processor Architecture –Optimized to Deliver Industry’s Highest Performance for Next-Generation Networking and Communications Applications [Broadcom press release, Oct 15, 2013]:

- Quad-issue, quad-threaded 64-bit ARMv8-A core with superscalar out-of-order execution delivers true server-class performance

- Core enables 3-GHz performance in the advanced 16-nm FinFET process node

- Partnership with ARM aims to define and develop an open, ISA-independent Network Function Virtualization (NFV) software environment

Coherent Interconnect Technology Supports Exponential Data Flow Growth [Ian Forsyth on ‘ARM Connected Community’, Oct 26, 2013]: “Recently I presented “Coherent Interconnect Technology Supports Exponential Data Flow Growth” at the Linley Processor conference in Santa Clara, CA where I announced a new ARM coherent interconnect product for enterprise applications, the CoreLink CCN-508. … CoreLink CCN-508 is a cache coherent network providing support for up to 32 fully coherent cores. Supported cores include Cortex-A57 and Cortex-A53.” From: “ARM is just beginning to engage with customers for the CCN-508, and it expects the first SoCs using this IP to enter production in late 2014 or early 2015.”

Rockchip extends partnership with ARM by subscription license of ARM processor and GPU technologies [press release, Nov 5, 2013]

ARM Cortex-A53 — Who You callin’ LITTLE? [Brian Jeff on ‘ARM Connected Community’, Oct 30, 2013]

I may only weigh in at just over half a square millimeter on die, but I can handle a heavy workload and I pack quite a processing punch, and frankly I’m tired of the lack of respect I get as a “LITTLE” processor. I am the Cortex™-A53 processor from ARM, some of you may have previously known me by my code name “Apollo”. Despite being three times as efficient as my big brother, the Cortex-A57, and delivering more performance than today’s current heavyweight champ the Cortex-A9, I am often overlooked.

Processor designers and consumers alike look to the big core, the top end MHz figure, and the number of big processors in the system when they evaluate devices like premium smartphones and tablets. What they don’t realize is that I’m the one running during most of the time the mobile applications cluster is awake, and I’m the one that will enable improvements in battery life even as delivered peak performance increases dramatically. It is high time that the LITTLE processor gets the respect and appreciation that is due.

I’m speaking not just for myself here, but for my close cousin the Cortex-A7. We’re built from the same DNA, so to speak, sharing the same 8-stage pipeline and in-order structure. We both consume about the same level of power on our respective production process nodes, and although I bring added performance and support 64-bit, we are both quite alike. We are 100% code compatible for 32-bit code after all. And yet we don’t get the respect we deserve. It is an injustice, really.

In high-end mobile devices, my cousin the Cortex-A7 is always telling me how everyone wants to hear about how fast the Cortex-A15 is in the system, how many Cortex-A15 CPUs are in the system, and how many Mali™ GPU cores are built into the SoC. They don’t even notice if there are four Cortex-A7 cores in the design capable of delivering plenty of performance, more performance than a lot of smartphones in the market today. They just expect battery life to improve without giving any credit to the LITTLE processor that makes it possible.

Well they will soon see… big.LITTLE processors are coming into the market next year, nearly sampling already, and the capability of the LITTLE processor will be in full view, let me tell you.

Oh, and another thing: in the enterprise space, what they call “big Iron”, there is almost no recognition of the worth of small processors there. Sure, new designs are considering LITTLE processors in many-core topologies with ARM’s CoreLink™ Cache Coherent Network (CCN) interconnect, but look at the products that are deployed today: they are mostly based on big cores, the bigger the better. Nowhere is this more evident than in the server space, where IT managers brag about how big their server racks are. Just wait and see. New server processors are being developed based on ARM, where even my big brother the Cortex-A57 is about an order of magnitude smaller and lower power than the incumbent processors. I’m in a different weight class altogether, but I can hang with the big boys on total performance. Purpose-built servers using lots of Cortex-A53 cores can deliver even more aggregate performance in a given power and thermal envelope. But are we LITTLE cores getting much attention in servers today? No. Well just watch and see. In 2015 when the first Cortex-A50 series 64-bit processors are built for lower power servers, you won’t be able to help but notice that LITTLE processors can get key jobs done in a lot less energy.

So I may be the same size relative to my Cortex-A57 big brother as the Cortex-A7 is to the Cortex-A15, but OEMs and consumers better not underestimate me. I’ve been going through intensive work these past 2 years to build up my muscles in the places that count: my SIMD performance is way up thanks to the improved NEON™ architectural support in ARMv8 and a much wider NEON datapath. I can dual-issue almost anything. My memory system is also juiced up, as is my branch predictor capability. That’s how I can pack a bigger punch than Cortex-A9 at around a quarter the power in our respective process nodes.

That’s all I’m saying, man. You gotta respect the LITTLE processor.

Peace.

AnandTech Live with ARM’s Peter Greenhalgh [anandshimpi YouTube channel, Dec 20, 2013]

A live chat with ARM Fellow and Lead Architect on Cortex A53, Peter Greenhalgh

From the earlier: Answered by the Experts: ARM’s Cortex A53 Lead Architect, Peter Greenhalgh [AnandTech, Dec 17, 2013]

Cortex-A53 has been designed to be able to easily replace Cortex-A7. For example, Cortex-A53 supports the same bus-interface standards (and widths) as Cortex-A7 which allows a partner who has already built a Cortex-A7 platform to rapidly convert to Cortex-A53.

A Cortex-A53 cluster only supports up to 4-cores. If more than 4-cores are required in a platform then multiple clusters can be implemented and coherently connected using an interconnect such as CCI-400. The reason for not scaling to 8-cores per cluster is that the L2 micro-architecture would need to either compromise energy-efficiency in the 1-4 core range to achieve performance in the 4-8 core range, or compromise performance in the 4-8 core range to maximise energy-efficiency in the 1-4 core range.
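The cluster limit above implies a simple sizing rule for larger designs. The sketch below only illustrates that arithmetic; the function name `clusters_needed` is hypothetical, and real platform configurations depend on the interconnect and SoC design, which this does not model.

```python
import math

# Per-cluster core limit stated in the text above.
MAX_CORES_PER_CLUSTER = 4


def clusters_needed(total_cores: int) -> int:
    """Smallest number of 4-core clusters that can hold total_cores cores."""
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    return math.ceil(total_cores / MAX_CORES_PER_CLUSTER)


for cores in (4, 8, 16, 32):
    print(f"{cores} cores -> {clusters_needed(cores)} cluster(s)")
```

An 8-core platform therefore needs two clusters joined by a coherent interconnect, which matches the 4+4 configuration mentioned in the interview.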

We expect to see a range of platform configurations using Cortex-A53. A 4+4 Cortex-A53 platform configuration is fully supported and a logical progression from a 4+4 Cortex-A7 platform.

We’re pretty happy with the 8-stage (integer) Cortex-A53 pipeline and it has served us well across the Cortex-A53, Cortex-A7 and Cortex-A5 family. So far it’s scaled nicely from 65nm to 16nm and frequencies approaching 2GHz so there’s no reason to think this won’t hold true in the future.

Cortex-A53 has the same pipeline length as Cortex-A7 so I would expect to see similar frequencies when implemented on the same process geometry. Within the same pipeline length the design team focussed on increasing dual-issue, in-order performance as far as we possibly could. This involved symmetric dual-issue of most of the instruction set, more forwarding paths in the datapaths, reduced issue latency, larger & more associative TLB, vastly increased conditional and indirect branch prediction resources and expanded instruction and data prefetching. The result of all these changes is an increase in SPECInt-2000 performance from 0.35-SPEC/MHz on Cortex-A7 to 0.50-SPEC/MHz on Cortex-A53. This should provide a noticeable performance uplift on the next generation of smartphones using Cortex-A53.
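The size of that per-clock improvement follows directly from the two quoted figures. The snippet below is a minimal sketch of the arithmetic; the helper names are hypothetical, and the 1400 MHz clock used for the absolute estimate is an illustrative assumption, as is the idea that the score scales linearly with frequency.

```python
A7_SPEC_PER_MHZ = 0.35   # quoted SPECInt-2000 per MHz for Cortex-A7
A53_SPEC_PER_MHZ = 0.50  # quoted SPECInt-2000 per MHz for Cortex-A53


def uplift_percent(old: float, new: float) -> float:
    """Relative improvement of new over old, in percent."""
    return (new - old) / old * 100.0


def estimated_score(spec_per_mhz: float, freq_mhz: float) -> float:
    """Score estimate assuming linear scaling with clock frequency."""
    return spec_per_mhz * freq_mhz


print(f"Per-clock uplift: {uplift_percent(A7_SPEC_PER_MHZ, A53_SPEC_PER_MHZ):.1f}%")
for name, rate in (("Cortex-A7", A7_SPEC_PER_MHZ), ("Cortex-A53", A53_SPEC_PER_MHZ)):
    # 1400 MHz is an assumed clock chosen only for illustration.
    print(f"{name} at 1400 MHz: about {estimated_score(rate, 1400):.0f}")
```

The uplift works out to roughly 43% at equal frequency.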

Due to the power-efficiency of Cortex-A53 on a 28nm platform, all 4 cores can comfortably be executing at 1.4GHz in less than 750mW which is easily sustainable in a current smartphone platform even while the GPU is in operation.
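Those numbers give a rough upper bound on the per-core budget. The sketch below assumes the 750 mW is split evenly across the four cores, ignoring shared logic such as the L2 cache, so it is an approximation and not a measured figure; the function names are hypothetical.

```python
def per_core_power_mw(cluster_power_mw: float, cores: int) -> float:
    """Even split of the cluster power budget across cores."""
    return cluster_power_mw / cores


def energy_per_cycle_pj(power_mw: float, freq_ghz: float) -> float:
    """Energy per clock cycle in picojoules.

    mW / GHz = 1e-3 W / 1e9 Hz = 1e-12 J, which is one picojoule.
    """
    return power_mw / freq_ghz


core_mw = per_core_power_mw(750.0, 4)
print(f"Per-core power bound: {core_mw:.1f} mW")
print(f"Energy per cycle bound: {energy_per_cycle_pj(core_mw, 1.4):.1f} pJ")
```

Because the source says "less than 750mW", both results are upper bounds: under 187.5 mW per core and under about 134 pJ per cycle per core.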

The performance per watt (energy efficiency) of Cortex-A53 is very similar to Cortex-A7. Certainly within the variation you would expect with different implementations. Largely this is down to learning from Cortex-A7 which was applied to Cortex-A53 both in performance and power.

Intel to make ARM Processors, firstly 64bit 14nm ARM Cortex-A53 ARMv8 for Altera [Charbax YouTube channel, Oct 31, 2013]

Nathan Brookwood is an Analyst and Research Fellow at Insight 64; he is the source for the Forbes article http://www.forbes.com/sites/jeanbaptiste/2013/10/29/exclusive-intel-opens-fabs-to-arm-chips/ The new Intel CEO has changed Intel’s policy, now deciding that it’s actually OK to manufacture ARM processors in Intel’s fabs. Intel may now also make ARM processors for Apple, Qualcomm, Nvidia, AMD or someone else, and possibly even for itself, perhaps releasing a whole range of Intel ARM processors if Intel cares to have some reach into smartphones, tablets, ARM laptops, smart TVs, ARM desktops and ARM servers. I think there is no reason Intel could not contribute to each of those ARM categories itself as well as by fabricating for chip makers; it depends on what the new Intel CEO decides is the right thing to do.

Altera Announces Quad-Core 64-bit ARM Cortex-A53 for Stratix 10 SoCs [press release, Oct 29, 2013]

Manufactured on Intel’s 14 nm Tri-Gate Process, Altera Stratix® 10 SoCs Will Deliver Industry’s Most Versatile Heterogeneous Computing Platform

Santa Clara, Calif., ARM TechCon, October 29, 2013—Altera Corporation (NASDAQ: ALTR) today announced that its Stratix 10 SoC devices, manufactured on Intel’s 14 nm Tri-Gate process, will incorporate a high-performance, quad-core 64-bit ARM Cortex™-A53 processor system, complementing the device’s floating-point digital signal processing (DSP) blocks and high-performance FPGA fabric. Coupled with Altera’s advanced system-level design tools, including OpenCL, this versatile heterogeneous computing platform will offer exceptional adaptability, performance, power efficiency and design productivity for a broad range of applications, including data center computing acceleration, radar systems and communications infrastructure.

From: Intel fabs Altera’s Stratix 10 FPGA with four ARM A53 cores [SemiAccurate, Nov 5, 2013]: Altera representatives at Techcon said that the beast would tape out in Q4/2014 or about a year from now.

From: Pigs Fly. Altera Goes with ARM on Intel 14nm [SemiWiki.com, Oct 29, 2013]:

I asked Altera about the schedule for all of this. Currently they have over 100 customers using the beta release of their software to model their applications in the Stratix 10. They have taped out a test-chip that is currently in the Intel fab. In the first half of next year they will have a broader release of the software to everyone. They will tape out the actual designs late in 2014 and have volume production starting in early 2015.

Why did they pick this processor? It has the highest power efficiency of any 64-bit processor. Plus it is backwards compatible with previous Altera families which used (32-bit) ARM Cortex-A9. The A53 has a 32-bit mode that is completely binary compatible with the A9. As I reported last week from the Linley conference, ARM is on a roll into communications infrastructure, enterprise and datacenter so there is a huge overlap between the target markets for the A53 and the target markets for the Stratix 10 SoCs.

The ARM Cortex-A53 processor, the first 64-bit processor used on a SoC FPGA, is an ideal fit for use in Stratix 10 SoCs due to its performance, power efficiency, data throughput and advanced features. The Cortex-A53 is among the most power efficient of ARM’s application-class processors, and when delivered on the 14 nm Tri-Gate process will achieve more than six times more data throughput compared to today’s highest performing SoC FPGAs. The Cortex-A53 also delivers important features, such as virtualization support, 256TB memory reach and error correction code (ECC) on L1 and L2 caches. Furthermore, the Cortex-A53 core can run in 32-bit mode, which will run Cortex-A9 operating systems and code unmodified, allowing a smooth upgrade path from Altera’s 28 nm and 20 nm SoC FPGAs.
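The "256TB memory reach" figure can be related to an address width with a short calculation. The sketch below assumes "TB" is meant in the binary sense (2^40 bytes); the press release does not say whether the figure refers to virtual or physical addressing, and the helper name is hypothetical.

```python
TIB = 2 ** 40  # assumption: "TB" in the release is read as a binary terabyte


def address_bits(reach_bytes: int) -> int:
    """Minimum number of address bits needed to index `reach_bytes` bytes."""
    return (reach_bytes - 1).bit_length()


reach = 256 * TIB
bits = address_bits(reach)
print(f"256TB reach corresponds to {bits}-bit addressing")
```

Under that reading, 256TB is exactly 2^48 bytes, that is, a 48-bit address space.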

“ARM is pleased to see Altera adopting the lowest power 64-bit architecture as an ideal complement to DSP and FPGA processing elements to create a cutting-edge heterogeneous computing platform,” said Tom Cronk, executive vice president and general manager, Processor Division, ARM. “The Cortex-A53 processor delivers industry-leading power efficiency and outstanding performance levels, and it is supported by the ARM ecosystem and its innovative software community.”

Leveraging Intel’s 14 nm Tri-Gate process and an enhanced high-performance architecture, Altera Stratix 10 SoCs will have a programmable-logic performance level of more than 1GHz; two times the core performance of current high-end 28 nm FPGAs.

“High-end networking and communications infrastructure are rapidly migrating toward heterogeneous computing architectures to achieve maximum system performance and power efficiency,” said Linley Gwennap, principal analyst at The Linley Group, a leading embedded research firm. “What Altera is doing with its Stratix 10 SoC, both in terms of silicon convergence and high-level design tool support, puts the company at the forefront of delivering heterogeneous computing platforms and positions them well to capitalize on myriad opportunities.”

By standardizing on ARM processors across its three-generation SoC portfolio, Altera will offer software compatibility and a common ARM ecosystem of tools and operating system support. Embedded developers will be able to accelerate debug cycles with Altera’s SoC Embedded Design Suite (EDS) featuring the ARM Development Studio 5 (DS-5™) Altera® Edition toolkit, the industry’s only FPGA-adaptive debug tool, as well as use Altera’s software development kit (SDK) for OpenCL to create heterogeneous implementations using the OpenCL high-level design language.

“With Stratix 10 SoCs, designers will have a versatile and powerful heterogeneous compute platform enabling them to innovate and get to market faster,” said Danny Biran, senior vice president, corporate strategy and marketing at Altera. “This will be very exciting for customers as converged silicon continues to be the best solution for complex, high-performance applications.”

About Altera

Altera® programmable solutions enable designers of electronic systems to rapidly and cost effectively innovate, differentiate and win in their markets. Altera offers FPGAs, SoCs, CPLDs, ASICs and complementary technologies, such as power management, to provide high-value solutions to customers worldwide. Follow Altera via Facebook, Twitter, LinkedIn, Google+ and RSS, and subscribe to product update emails and newsletters. altera.com

My post Altera will use Intel Custom Foundry’s 14 nm Tri-Gate (FinFET) process services to produce its new high-end SoC FPGA with 64-bit ARM Cortex-A53 IP [‘Experiencing the Cloud’, Nov 1, 2013] already answered in detail the following questions that arose from the above announcement:

Why FPGAs? Why more FPGAs?

Why SoC FPGAs?

Why ARM with FPGA on the Intel Tri-Gate (FinFET) process, and why now?

OpenCL for FPGAs

Altera SoC FPGAs

Filed under: consumer computing, consumer devices, smartphones, SoC Tagged: 14nm, 28nm, 4G, 64-bit, 8-core, Altera, ARMv8-A, Cortex-A53, Cortex-A7, finFET, high-volume market, Intel, Krait 450, LTE, MediaTek, MT6590, MT6595, Peter Greenhalgh, Qualcomm, Snapdragon 410, Snapdragon 805, Stratix 10, Tri-Gate, True Octa-core
