2013-07-20

This report consists of the following parts:

The latest MediaTek roadmap

News reports about MT6592 and its first application

What is new vs. my earlier The state of big.LITTLE processing [‘Experiencing the Cloud’, April 7, 2013] report

For the preceding smartphone SoC in the current roadmap see MediaTek MT6589 quad-core Cortex-A7 SoC with HSPA+ and TD-SCDMA is available for Android smartphones and tablets of Q1 delivery [‘Experiencing the cloud’, Dec 12, 2012]. For smartphone SoCs before that see Boosting the MediaTek MT6575 success story with the MT6577 announcement – UPDATED with MT6588/83 coming early 2013 in Q42012 and 8-core MT6599 in 2013 [‘Experiencing the cloud’, June 27, July 27, Sept 11-13, Sept 26, Oct 2, 2012]. Note that MT6588 was renamed MT6589 when it was launched, just as MT6599 has now been renamed MT6592.

The latest MediaTek roadmap

Maybank Kim Eng just published in its MediaTek Closing In Fast [July 17, 2013] report the following two SoC roadmaps:



GPU for MT6592 smartphone SoCs (and presumably for MT6588 as well) will be Mali according to Zhu Shangzu (朱尚祖), MediaTek Global Smartphone General Manager in the [Part 2] MediaTek to push 8 small cores, the mystery [ESM 国际电子商情 (International Electronic Business), July 18, 2013] exclusive interview.

According to the 28nm Technology [TSMC, June 21, 2011] description: The 28nm technology node of the TSMC foundry (which MediaTek uses for manufacturing) has a high performance (HP) process as the first option to use high-k metal gate (HKMG) process technology. The 28nm low power with high-k metal gates (HPL) technology, as the second option, adopts the same gate stack as HP technology while meeting more stringent low leakage requirements with a trade-off in performance speed. Explanation: From about 10 µm down to below 0.1 µm (100 nm), CMOS technology used conventional silicon oxynitride as the gate insulator with a polysilicon gate, the so-called poly/SiON gate stack. It was typically possible to scale this stack down to 45 nm; only TSMC was able to scale it further, to 28 nm, and most of the current 28nm SoCs from TSMC are produced on that poly/SiON process. While Intel (and IBM) had to introduce a high-k dielectric as the gate insulator with a metal gate, the so-called High-k / Metal Gate stack, for the performance of their 45 nm products (in order to keep up with Moore’s law in their realm), TSMC introduced it only at the 28nm node, as described above. The HKMG-based 28nm SoCs deliver much higher performance (or lower power), as you can see from the 2GHz clock rate of the MT6592 (above) or the MT8135 (below).



For the upcoming MT8135 tablet SoC it is known from part 3 of the Zhu Shangzu interview that the quad-core configuration will be 2xA15+2xA7, which means a big.LITTLE architecture and, quite probably, the already mature ‘In Kernel Switcher’ (IKS) scheduler initially. But as ARM has already decided on the architecture of the other, more general ‘Global Task Scheduling’ (GTS) solution (see further below), I would assume that the proper hardware underpinnings for GTS will already be built in (unlike in Samsung’s Exynos 5 SoC released before), so once the scheduler software is mature enough it will run well on the MT8135. The inclusion of just two cores of each type (unlike in Exynos 5) is a very strong proof-point of that. As far as the GPU is concerned, we know from the Zhu Shangzu interview that an Imagination GPU will be used, therefore I will leave the next-generation SGX6XX (PowerVR Series6 or ‘Rogue’) indication in the above table.

Maybank Kim Eng published the roadmaps with the following commentary:

Strong fundamentals intact. Having exceeded its 2Q13 guidance so significantly, we believe MTK will continue to ride the strong momentum in 3Q13, perhaps growing its revenue by low-to-mid-teens QoQ or 30% YoY to chalk up another record high of TWD36-38b [US$1.2-1.27B]. Importantly, a better product mix and cost structure would help lift its profitability to ±44%. We expect MTK to ship 70-72m units of smart devices, up 25-30% QoQ, with quad-core APs and tablets making up nearly 50% of total shipment. The benefits of operating leverage should drive OPM past 20%, the highest since 3Q10. MTK is set to report its 2Q13 results in late July or early August and we forecast net profit of TWD6.8b [US$227M] (EPS: TWD5.02; Street: TWD6.3b), up over 80% QoQ and 100% YoY. GM is also likely to meet the high end of its guidance, ie, 43.5%, on richer mix and improved cost structure. Reported revenue of TWD33.3b, up almost 40% QoQ and 42%YoY, is already well ahead of guidance (TWD30-32b). However, we cut our FY13/14 earnings forecasts by 3% each to factor in the delay in merger with MStar and potential inventory correction in 4Q13/1Q14. MTK remains a key BUY in our tech space.

Closing in fast on QCOM. MTK has spared no efforts to enhance its smart device portfolio since 2H12 and further signs of acceleration are evident. It is introducing two high-end APs in 4Q13 – MT6588 and MT6592 – using 28nm HKMG and advanced graphic features. While the former is a quad-core AP operating at 1.7GHz, the latter is capable of running at 2GHz (when all eight core engines are turned on). In the absence of full details, we estimate MT6592 may perform closer to Qualcomm Snapdragon 600 AP (used in Galaxy S4 and HTC One), while MT6588 should outshine Snapdragon 400. MTK has won several international OEMs with MT6589 and with MT6588/6592, its chances of penetrating tier-1 OEMs have increased significantly. In addition, it will sample its high-end 4G/LTE/LTE-TDSCDMA modem chipset in anticipation of the launch of 4G network in China later this year. As for tablets, MTK’s latest APs MT8125/8389 were well-received and it is set to deliver the high-end MT8135 (big.Little design) in 3Q13. We expect its smartphone/tablet shipments to reach 200-225m/25m units in 2013.

In the same part of the interview Zhu Shangzu explained MediaTek’s high-end strategy as follows:



… I think the future of high-end smartphones innovation will focus on the expansion of big screen multimedia applications, and this is our direction. …

Judging from the current situation, customers of high-end flagship phones are still using the products of the competitors, but there is flagship in our quad-core case as well, and OPPO, Vivo and GiONEE and other quad-core phones are also very popular. Our next goal is to get the customers of flagship machines using our platform via helping customers to achieve stronger performance on the big screen multimedia.

Therefore, the 8-core MT6592 can be regarded as our first bugle call for moving towards the high-end market. Our mission is that one day customers can also recognize MediaTek as doing high-end flagship products. MT6592 is the first step, strictly speaking, it is not the most high-end platform, next we will move step by step towards the higher end.

Q: Why will MediaTek use eight small A7 cores as a generation of high-end platform, but did not choose to use four large A15 cores or four big and four small ones as a way to achieve the goal? This is also a question for the industry as there are many controversial issues with this.  

For power, or performance per watt, we did a lot of investigation. Eight A7 cores is currently the best solution, and since we designed a process to boost the peak frequency of the A7 to 1.9-2GHz, performance is also very strong.

Currently we chose a small core because, under the existing process, the larger the chip die size, the larger the standby leakage, resulting in higher standby power consumption. For example, the A15 is the strongest core currently, but not in run-time power consumption. Even if its frequency is pushed down to very low levels, there is still a larger leakage. Therefore, the larger the area of a single core, the larger the energy-efficiency overhead, and as long as the power is on, there will be greater leakage.

In addition, the 8-core CPU is just one aspect of improving the mobile multimedia experience. In fact, as we have been doing MediaTek digital TV for a long time, we will extend that digital TV competency here – some strong move for the smartphones. This is what other platform vendors can not do. In the 6592, for example, the latest HEVC codec will be integrated. [HEVC is a video compression standard, a successor to H.264/MPEG-4 AVC]



Although our MT6592 also uses a ‘Deluxe’ Mali quad-core GPU, our HEVC is a software solution running on the 8-core CPU rather than a GPU-based software solution, so that content developers can achieve better compatibility: some strong content developers will use their own HEVC decoder. Currently the ‘Deluxe’ quad-core GPU on the 6592 is mainly used to run large-scale games and to render some advanced UI.

How to plan the future in the tablet market?

Q: I do note that the MT6592 is now using a quad-core Mali GPU, while before the MediaTek mainstream used Imagination GPU. How would you rate these two companies’ products?

The Imagination company has been doing GPUs long time in its history, the architecture design is beautiful, more artistic. The initial architecture of Mali [from ARM] would be more rough, and therefore area and power consumption will be worse. But after nearly three years of time, Mali has made a lot of progress, both are learning from each other, and by now the levels of these two are equal. The future perspective is that ARM’s overall resources are somewhat more fully available.

Q: This year we have seen MediaTek attack the tablet market. What is the plan for the future in the tablet market?

A: Our current strategy is to carry out a mobile phone product line extension.

At the end of July the launch of a tablet chip is expected: the MT8135, with 2xA15 +2xA7, still using an Imagination GPU, and mainly targeting the high-end tablet market. A small reminder, our MT6572 is not suitable for tablet computers as the original definition did not take into account the application of flat-screen.



News reports about MT6592 and its first application

On July 18 this information also appeared on the English http://en.v5zn.com/ website of the related smartphone vendor: MediaTek MT6592’s first eight-core mobile phone exposure makes you believe [July 15, 2013] as translated by Google and Bing with manual edits

MediaTek’s so-called true eight-core processor MT6592 was announced not long ago, and the first models equipped with the processor are expected to surface. News has broken that the domestic mobile phone brand named after the 19th-century French writer Jules Verne [凡尔纳] has decided to launch a flagship model “V8” equipped with the MT6592 processor.

Verne’s current main product is the “V5” model, equipped with a quad-core MediaTek MT6589, a 5-inch 720p OGS full lamination screen, 1GB of RAM, 4GB storage, an 8-megapixel back-illuminated camera and a 2400 mAh battery, with a list price of 999 yuan [$166].

The exact configuration of the V8 has not yet been announced, but it is estimated to have about a 5.5-inch 1080p screen, 2GB RAM, 32GB storage, a 13-megapixel Sony stacked camera, a higher-capacity battery, etc.; without these it would naturally be embarrassed to call itself a flagship.

It looks like that cooperation between MediaTek and the domestic Shanzhai vendors remains close. As MT6589 has rocked the Main Street, MT6592 will soon become a standard, and “an eight-core” promotion will be overwhelming.

Incidentally, a recap: MT6592 uses eight Cortex-A7 architecture cores, clocked at up to 2.0GHz, with TSMC 28nm manufacturing; its Antutu score is reported as close to 30,000, but the graphics core has not been confirmed, with PowerVR SGX 544MP4/554MP4 being likely [it will be Mali, as communicated by MediaTek, see above].

The marketing of the processor has begun to customers, but mass production will be in November, so if recent high profile publicity is to be fulfilled, certainly we will have a large sale early next year.

Company introduction [Jules Verne mobile phone, January 16, 2013] as translated by Google and Bing with manual edits

Shenzhen MINDRAY Platinum Communication Technology Ltd. is a high-tech company specialized in the development, production, sales and service of intelligent mobile terminals. Under the “Jules Verne VOWNEY” brand the company aims to create a mobile intelligent terminal brand.

MINDRAY Platinum company with “intelligent life” as the brand mission, is to “enhance the user experience, to help people grasp the development opportunities” as the goal, trying to make Jules Verne a trustworthy, continuous innovation and smart moves life guide. Every effort, just as long as you!

Jules Verne mobile phone network direct sales, stripping agents layers, increases direct benefits to consumers. We are committed to allow more consumers to have a better quality of life with an intelligent terminal.

The “Jules Verne VOWNEY” brand aspires to be a guide to intelligent mobile terminals that improve users’ quality of life, leading you into the “Slide 5.0” era.

Brand interpretation

Jules Verne: a derivative of intelligent life???

English explanation : VOWNEY
V : value— Value
O : opportunity— Opportunity
W : worth— It is worth
N : new— New
E : e— Mobile Internet
Y : you— You

Jules Verne is to ” create a new life guided smart” as the goal, and strive to become a trusted, sustainable and innovative mobile phone brand, all efforts, just because of you!

Mediatek MT6592 8 core processors coming by the end of July! [Gizchina.com]

Reports out of Taiwan state that Mediatek will launch the MT6592 8-core processor by the end of July.

There was word that Mediatek were working on an 8 core chipset late last year, but like many we believed it had been placed on the back burner while they prepared their LTE chip. This seems to be wrong though as sources in Taiwan claim that Mediatek’s 8-core processor will arrive before the end of this month!

The MT6592 chip will be made up of 8 Cortex-A7, 28nm processor clocked at a frequency of up to 2Ghz! Early tests have the 8 core MT6592 scoring up to 30,000 points in Antutu which is more than Samsung’s 8 core Exynos 5410 processor.

The first batch of these new processors will be ready for manufacturers to begin development by the end of July, while Mediatek are preparing full-scale manufacture for November!

If everything goes to plan we can expect powerful 8 core phones from Tier 1 Chinese phone manufacturers by December!

MediaTek to launch true 8-core, 2GHz MT6592 chipset in November? [Engadget, July 2, 2013]

Samsung may already have its 8-core Exynos 5 Octa offering, but the original “big.LITTLE” implementation means only up to four cores work together at any time — either the Cortex-A15 quartet or its lesser Cortex-A7 counterpart. In other words, we’d rather rename the chipset range to something like “Exynos 5 Quad Dual.” But according to recent intel coming from Taipei and Shenzhen, it looks like Taiwan’s MediaTek is well on its way to ship a true 8-core mobile chipset in Q4 this year.

The first mention of this 2GHz, Cortex-A7 MT6592 chip came from UDN earlier today. The Taiwanese publication claims MediaTek started introducing its first octa-core product to clients last week, and it’s expected to enter mass production using TSMC’s 28nm process in November. The first mobile devices to carry this hot piece of silicon may hit the market in early 2014 — hopefully just in time for the Chinese New Year shopping rush.

UDN adds that the MT6592 scored close to 30,000 on AnTuTu, which is pretty high but still some distance behind Qualcomm’s 2.2GHz quad-core Snapdragon 800. Of course, chances are MediaTek’s offering will be much cheaper, as evidenced by all the affordable MediaTek-powered devices in China these days.

In a separate article from last week, UDN pointed out that judging by over a hundred job openings released by MediaTek last month, the company is clearly putting an emphasis on 4G LTE technology, alongside GPU and Android development. The publication also quoted chairman Tsai Ming-kai saying he will launch an LTE solution in Q4 this year, by which point MediaTek will only be one or two years behind its competitors.

The second piece of info came from HQ Research analyst Pan Jiutang, who posted an alleged spy shot of MediaTek’s upcoming roadmap (pictured left). There the octa-core MT6592 is listed with a clock speed of 1.7GHz to 2GHz, along with 1080p 30fps video decoding support. There’s also a quad-core 1.7GHz MT6588 accompanying its octa-core sibling in the same period on the timeline, though it appears to be just a faster version of the current 1.2GHz MT6589.

For the sake of phone manufacturers, both new chipsets will apparently be pin-to-pin compatible with the quad-core 1.3GHz MT6582 due Q3 this year, thus lowering R&D costs. Better yet, the roadmap also states that the MT6290 LTE modem — as teased by Tsai above — will be compatible with these three chipsets.

With MediaTek quickly catching up ahead of China’s eventual TD-LTE launch, Qualcomm will need to tread carefully to keep its Chinese QRD partners happy.

[Thanks, Ryan!]

Update: It’s worth noting that ARM’s eventual “big.LITTLE MP” implementation will allow all eight cores to run simultaneously, but the Exynos 5 Octa currently doesn’t support this. Thanks, UncleAlbert!

SOURCE: Sina Weibo (login required), UDN (1), (2)

What is new vs. my earlier The state of big.LITTLE processing [‘Experiencing the Cloud’, April 7, 2013] report

Power scheduler design proposal [by Morten Rasmussen from ARM on Linux kernel mailing list, July 9, 2013]

This patch set is an initial prototype aiming at the overall power-aware scheduler design proposal that I previously described <http://permalink.gmane.org/gmane.linux.kernel/1508480>.

The patch set introduces a cpu capacity managing ‘power scheduler’ which lives by the side of the existing (process) scheduler. Its role is to monitor the system load and decide which cpus that should be available to the process scheduler. Long term the power scheduler is intended to replace the currently distributed uncoordinated power management policies and will interface a unified platform specific power driver obtain power topology information and handle idle and P-states. The power driver interface should be made flexible enough to support multiple platforms including Intel and ARM.

This prototype supports very simple task packing and adds cpufreq wrapper governor that allows the power scheduler to drive P-state selection. The prototype policy is absolutely untuned, but this will be addressed in the future. Scalability improvements, such as avoid iterating over all cpus, will also be addressed in the future.

Thanks,

Morten

From <http://permalink.gmane.org/gmane.linux.kernel/1508480>





Linux Kernel News – June 2013 [by Shuah Khan in Linux Journal , July 9, 2013]

As always the Linux kernel community has been busy moving the Linux mainline to another finish line and the stable and extended releases to the next bump in their revisions to fix security and bug fixes. It is a steady and methodical evolution process which is intriguing to follow. Here is my take on the happenings in the Linux kernel world during June 2013.

Mainline Release (Linus’s tree) News

Linus Torvalds released Linux 3.10. You can read what Linus Torvalds had to say about this release in his release announcement at http://lkml.indiana.edu/hypermail/linux/kernel/1306.3/04336.html

Two notable features in this release are improved SSD caching and better Radeon graphics driver Power Management.



Power efficient scheduling design

Ingo Molnar (Red Hat, x86 maintainer), Morten Rasmussen (ARM, power mgmt.), Priti Murthy (IBM, scheduler), Rafael Wysocki (Intel, Linux PM, and Linux ACPI maintainer) and Arjan van de Ven discussed the proposed power-aware or power-efficient scheduler design and what’s the best way to integrate it into the kernel.

Power management and the ability to balance performance and power efficiency is important and complex. It is not just about scheduler or cpus. It spans I/O devices that transition into lower-power states and how costly it is to bring them back to fully active state when needed. There is latency involved in these transitions. As always, Linux developers reach consensus to solve complex problems such as these and come up with path to get to the goal taking small steps towards that goal. Here is another example of that process at work.

Power-efficient scheduler work has been active for a few months now. Several RFC patches have been floated and discussed. This work is being pursued very actively in x86 space by IBM and in ARM space by ARM. The premise is that, if scheduler could pack tasks on a few cores and keep these cores fully utilized and, transition other cores to low power states, when the scheduling goal is power savings over performance. In other words, instead of keeping all the cores active, scheduler could consolidate tasks on a few cores and transition other cores to low-power states for better power efficiency.
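The packing premise described above can be illustrated with a small sketch. This is not kernel code and not any of the RFC patches; it is a toy model that assumes each task's load is known as a fraction of one core's capacity and uses a simple first-fit-decreasing placement. The names `pack_tasks`, `idle_cores` and `capacity` are hypothetical.

```python
from typing import Dict, List


def pack_tasks(task_loads: Dict[str, float], num_cores: int,
               capacity: float = 1.0) -> List[List[str]]:
    """Place tasks on as few cores as possible (first-fit decreasing).

    task_loads maps a task name to its load as a fraction of one core.
    Returns one list of task names per core; an empty list means the
    core received no work and could enter a low-power state.
    """
    cores: List[List[str]] = [[] for _ in range(num_cores)]
    used = [0.0] * num_cores
    # Heaviest tasks first, so large tasks claim space before small ones.
    for name, load in sorted(task_loads.items(), key=lambda kv: -kv[1]):
        for i in range(num_cores):
            if used[i] + load <= capacity + 1e-9:
                cores[i].append(name)
                used[i] += load
                break
        else:
            # Nothing fits: fall back to the least-loaded core (overcommit).
            i = min(range(num_cores), key=lambda k: used[k])
            cores[i].append(name)
            used[i] += load
    return cores


def idle_cores(placement: List[List[str]]) -> List[int]:
    """Indices of cores that received no tasks."""
    return [i for i, tasks in enumerate(placement) if not tasks]
```

With four light background tasks whose loads sum to less than one core, `pack_tasks` puts them all on core 0 and `idle_cores` reports the other three cores as candidates for a low-power state. A spreading scheduler would instead keep all four cores awake for the same work.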

It is easier said than done. Scheduler is at a higher level and would not be the best judge of making decisions on transitioning CPUs to idle states and deciding on the ideal frequency they should be running at. These decisions are better left to platform drivers that have the specific knowledge of the platform and architecture as they are complex and very hardware specific. In other words, power aware scheduler tuned to run well on x86 platforms will not work as well or could fail miserably on ARM platforms.

Scheduler has to accomplish load balancing as well as power balancing in a way to meet performance and power goals and do it well on all platforms. A generic scheduler doesn’t have to control and drive low-power state decisions on a platform. However, the goal of power-efficient scheduler is to set higher level abstracted policies that would work on all platforms. After a long and productive discussion, there is a consensus and here is the summary:

A new kernel configuration option CONFIG_SCHED_POWER to enable/disable the power scheduler feature. Power scheduler is totally inactive, when CONFIG_SCHED_POWER is disabled, and fully active when CONFIG_SCHED_POWER is enabled. The important goal is evolving the power scheduler feature without disrupting and destabilizing the current scheduler.

Work on a generic power scheduler with hardware and platform abstractions that will work well on big little ARM, x86, and other platforms. Avoid platform specific power policies that could lead to duplication of functionality in platform specific power drivers.

Please check the Linux Foundation site for presentations made at the Linux Collaboration Summit back in April 2013 on this topic. Here is the link to Jonathan Corbet’s blog on this topic.
http://www.linux.com/news/featured-blogs/200-libby-clark/715486-boosting…



From: big.LITTLE Software Update [by George Grey on Linaro Blog, July 10, 2013]

There are also two software models now available, that ARM and Linaro have developed to enable control of workloads, performance, and power management on big.LITTLE SoCs.

The first is the IKS [In Kernel Switcher, also known as CPU Migration] software, developed by Linaro, that treats each pair of Cortex-A7 and Cortex-A15 cores as a single ‘virtual’ core. On a multicore SoC each pair is treated as 1 of n virtual symmetric cores by the Linux kernel.

Core Software Configuration for IKS (4+4)

Using existing mechanisms in the Linux kernel for each pair the cpufreq driver controls whether the Cortex-A7 is active (for low power) or the Cortex-A15 is active (for maximum performance). Overall maximum performance and throughput on a 4+4 core SoC is from 4 Cortex-A15s. The key attribute of IKS is that it relies on existing well-understood mechanisms in the Linux kernel and it is easy to implement, test and characterize in a production environment.
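The pairing idea can be sketched roughly as follows. This is only an illustration of the concept, not Linaro's implementation: the frequency tables, the relative performance factor `A15_PERF_SCALE` and the names `virtual_opp_table` and `VirtualCpu` are all assumptions made up for the example. Each virtual CPU exposes one merged table of operating points, and asking for more performance than the Cortex-A7 can deliver switches the pair over to its Cortex-A15.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical operating points in MHz; real SoCs define their own tables.
A7_FREQS = [200, 400, 600, 800, 1000]
A15_FREQS = [800, 1000, 1200, 1400, 1600]
# Assumed relative throughput per MHz of an A15 versus an A7.
A15_PERF_SCALE = 2.0


def virtual_opp_table() -> List[Tuple[float, str, int]]:
    """Merge both cores' operating points into one table sorted by performance.

    Each entry is (performance, core_type, frequency_mhz). A15 points that
    would not beat the fastest A7 point are left out.
    """
    table = [(float(f), "A7", f) for f in A7_FREQS]
    table += [(f * A15_PERF_SCALE, "A15", f) for f in A15_FREQS
              if f * A15_PERF_SCALE > max(A7_FREQS)]
    return sorted(table)


@dataclass
class VirtualCpu:
    index: int
    active_core: str = "A7"
    freq_mhz: int = A7_FREQS[0]

    def set_performance(self, demand: float) -> None:
        """Pick the lowest operating point that satisfies the demand."""
        for perf, core, freq in virtual_opp_table():
            if perf >= demand:
                break
        # If demand exceeds every entry, the loop ends on the top entry.
        self.active_core, self.freq_mhz = core, freq
```

In a 4+4 configuration there would be four such `VirtualCpu` objects, so at most four physical cores run at any time, which matches the statement that maximum throughput comes from the four Cortex-A15s.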

The second is the Global Task Scheduling (GTS) [also known as big.LITTLE MP] software developed (and now named) by ARM. This is known in Linaro as big.LITTLE MP. Using GTS all of the big and LITTLE cores are available to the Linux kernel for scheduling tasks. We are very proud that Linaro has contributed to ARM’s development of the GTS software, and that it is now publicly available in Linaro builds. ARM and Linaro recommend GTS for new products, and Linaro members are actively planning product deployments using this solution.

Core Software Configuration for GTS (4+4)



The big.LITTLE MP patch set creates a list of Cortex-A15 and Cortex-A7 cores that is used to pick the target core for a particular task. Then, using runnable load average statistics, the Linux scheduler is modified to track the average load of each task, and to migrate tasks to the best core. High intensity tasks are migrated to the Cortex-A15 core(s) and are also marked as high intensity tasks for more efficient future allocations. Low intensity tasks remain resident on the Cortex-A7 core(s).
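The load-tracking and migration behaviour described above can be sketched in a few lines. This is a simplified illustration and not the big.LITTLE MP patch set itself: the threshold values, the decay factor and the names `Task`, `update` and `pick_cluster` are assumptions chosen for the example.

```python
from dataclasses import dataclass

UP_THRESHOLD = 0.7    # assumed: migrate to a big core above this load
DOWN_THRESHOLD = 0.3  # assumed: migrate back to a LITTLE core below this
DECAY = 0.5           # assumed weight given to history in the average


@dataclass
class Task:
    name: str
    load_avg: float = 0.0
    cluster: str = "A7"     # tasks start on the LITTLE cluster in this sketch

    def update(self, runnable_fraction: float) -> None:
        """Fold the latest period's runnable fraction into the running average."""
        self.load_avg = DECAY * self.load_avg + (1 - DECAY) * runnable_fraction

    def pick_cluster(self) -> str:
        """Move up or down only when a threshold is crossed (hysteresis)."""
        if self.cluster == "A7" and self.load_avg > UP_THRESHOLD:
            self.cluster = "A15"
        elif self.cluster == "A15" and self.load_avg < DOWN_THRESHOLD:
            self.cluster = "A7"
        return self.cluster
```

Using two separate thresholds means a task whose load hovers around a single cut-off does not bounce between clusters on every update; it stays where it is until its average clearly rises or falls.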

IKS and GTS are now publicly available in Linaro monthly engineering releases for the ARM TC2 Versatile Express hardware, and in Linaro’s interim Long Term Supported Kernel (LSK) build. Both will also be incorporated into the first full Linaro LSK, which will be based on the next Linux Foundation, Greg Kroah-Hartman designated, Long Term Supported (LTS) kernel.



Until GTS functionality is fully upstream, ARM is supporting the big.LITTLE MP patch set for its licensees, leveraging Linaro’s public monthly and Linaro LSK builds, so that it is available to all ARM licensees for product integration and deployment. Linaro also expect to provide a topic branch for the latest work available on the upstream GTS implementation for interested developers.

… ARM and Linaro now recommend product development and deployment to be based on the GTS solution. However, there are some cases where hardware limitations or a requirement for the traditional Linux scheduler (for example in some embedded applications) may lead to IKS still being required.

Future Work

Power management software in Linaro is worked on by the Power Management Working Group. Other activities within the Group will enable additional power savings on ARM multi-core devices. One current project worth highlighting is the work being done by Vincent Guittot on small-task packing. Normally the Linux kernel will spread running tasks over all the available CPU cores. On a handset in standby, or even when being used with low activity, there may be a number of housekeeping and other small tasks that run in the background or relatively infrequently and therefore keep cores active unnecessarily. If “small” tasks can be migrated to one core, then the other cores could be made idle or even turned off completely, potentially resulting in significant power savings. This feature is expected to offer improved power management to systems based on symmetric multi-core SoCs (for example dual or quad-core Cortex-A7 or Cortex-A15 parts), as well as big.LITTLE SoCs.

While the current big.LITTLE efforts are focused on Cortex-A15 and Cortex-A7, the techniques being implemented today for 32-bit systems are already being run on 64-bit models. We therefore expect to see the GTS software running on 64-bit Cortex-A57 and Cortex-A53 based big.LITTLE SoCs as soon as they become available.

Real Life Results

ARM has published further information on big.LITTLE configurations and performance in a blog entry here [Ten Things to Know About big.LITTLE [Brian Jeff on SoC Design blog of ARM, June 18, 2013]].

The first commercial products based on big.LITTLE are certain international versions of the latest Galaxy S4 phone from Linaro member, Samsung. Samsung-LSI provide an ‘Octa-core’ 4+4 big.LITTLE chip for this phone. As has been publicly noted, the current generation of hardware cannot yet take full advantage of the IKS or the GTS designs because the hardware power-saving core switching feature is implemented on a cluster basis rather than on a per-core or a per-pair basis. Even so, the first big.LITTLE implementation produces performance and power consumption on a par with the latest Qualcomm multi-core Snapdragon processor according to reviews from Engadget, PocketNow and others. Often first implementations of new technology never see the light of day – it is a tribute to Samsung’s engineers that the Exynos 5 is already seeing the Cortex-A15 level of performance with the power saving of the Cortex-A7s in a mass market handset in the very first big.LITTLE iteration.

We look forward to seeing what improvements full use of GTS will bring when used on future production devices from Samsung and others.

More information: Power Management with big.LITTLE: A technical overview [by Steven Willis in SoC blog of ARM, June 20, 2013]

Why all this sudden attention on the Linux Scheduler? [LCE13, Linaro Connect Europe]

12:00 PM – 13:00 PM on Monday, Jul 8, 2013 (IST)

Description

The Linux scheduler is getting a lot of attention in the ARM ecosystem these days. Come to this discussion to find out why.

Several people working on the scheduler or interested in changes to the scheduler will be invited to talk about their requirements, what is the state of their work, who will benefit from it, etc.

Video record of the Why all this sudden attention on the Linux Scheduler? discussion

Minutes of the above discussion

Determinism: problems
———————

* Preemption: interrupts, locking
* Latency
* Scheduling overhead
* Realtime processing

Most of the requirements are coming from LEG/LNG.

Solutions:
    – PREEMPT_RT
    – Adaptive NO_HZ (merged in 3.10)

        Came out of high-performance computing. When there is just one
        task, the scheduler is switched off for that CPU. Results in
        zero scheduler overhead. When the only task finishes – the CPU
        will get into scheduling/idle again.

        There is still once-per-second tick for scheduling. There
        is a patch removing that last remaining bit to make it fully
        tickless.

        We’re not sure yet if all the possible limitations are found -
        there still might be some scheduler overhead left.
        If interrupt handling is offloaded to other cores, caching
        related issues will still affect performance (e.g. serving IO
        interrupts for the task on a different core will require the
        dedicated core to cache the data once again).

    – Deadline

Physical process isolation: none addresses

    – Needed for KVM.

Temporal isolation: all three (with some limitations)
No scheduling overhead: ADAPTIVE NO_HZ only.
Firm/Hard Real-time PREEMPT_RT only
Complexity:
    high for PREEMPT_RT
    low for the rest

Requirements:
all of the above

Power efficiency: history
————————-
* sched-mc (got removed)
* big.LITTLE MP patches implementing GTS (ARM)
* Packing Small Tasks (Linaro/ARM)
    Pack all small background tasks on as little number of small cores
    as possible to conserve power.

    Intel approach does not care about which core is selected as the
    best one (Turbo Mode is effectively converting the core into a BIG
    core, while all the other cores are becoming little ones). Task
    migration is expensive – this approach helps avoiding it.

* Power aware scheduling (Intel)

    Discussions were lasting for a while and then Ingo Molnar requested
    an integral solution (not a set of independent bits).

    He made a good point. What we have is an SMP legacy implementation.

    Are we starting from scratch because of that?

    It is going to be a significant change. We need to re-think as it’s
    not SMP case anymore. b.L is not a new architecture – Intel already
    does that but differently.

    The task is to find the most efficient way of performing the work
    needed. The best place to make those decisions is the scheduler.

    Power efficiency – proposal (from ARM)
    --------------------------------------

    Separate process and power scheduler (ARM). This is the first step
    towards a fully integral scheduler in the future. It helps to cope
    with the complexity at hand. In this case there are certain
    limitations: one of the schedulers will be leading while the second
    one will be limited.

    That doesn’t work well for Intel CPUs (no pre-configured small/BIG
    cores).

    Issues:

    – Topology
        Missing:
        – Frequency domains, which CPUs are affected. That would be
          useful for the scheduler.

    – Idle + DVFS
        Missing:
        – information about the cost of using a certain core at certain
          DVFS operation point to perform a certain amount of work.

    – Thermal
        The idea is to keep an eye on the temperature trend to avoid
        cases where whole cores need to be temporarily shut down to
        cool them down.

        The GPU contribution to the thermal budget should also be
        considered.
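Watching the temperature trend, as the Thermal item suggests, could look roughly like the sketch below. It extrapolates linearly from the oldest to the newest sample. The function names, the trip temperature and the look-ahead horizon are assumptions for illustration, not an existing kernel interface.

```python
def seconds_until_trip(samples, trip_temp_c):
    """Estimate the time until the trip temperature is reached.

    samples: list of (time_s, temp_c) pairs, oldest first.
    Returns 0.0 if already at or above the trip point, None if the
    temperature is not rising, otherwise the estimated seconds left.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t_first, temp_first), (t_last, temp_last) = samples[0], samples[-1]
    if t_last <= t_first:
        raise ValueError("samples must be in increasing time order")
    if temp_last >= trip_temp_c:
        return 0.0
    slope = (temp_last - temp_first) / (t_last - t_first)  # degrees C per second
    if slope <= 0:
        return None
    return (trip_temp_c - temp_last) / slope


def should_throttle(samples, trip_temp_c, horizon_s):
    """True if the trend would hit the trip point within horizon_s seconds."""
    eta = seconds_until_trip(samples, trip_temp_c)
    return eta is not None and eta <= horizon_s


readings = [(0.0, 60.0), (10.0, 70.0)]
print(seconds_until_trip(readings, trip_temp_c=90.0))
print(should_throttle(readings, trip_temp_c=90.0, horizon_s=30.0))
```

Acting on the trend rather than on the trip point itself leaves room to lower frequency gradually instead of shutting cores down. A fuller model would add the GPU's share of the heat as another input.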

    Trying to control DVFS from the scheduler. Patches are expected very
    soon.
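The information listed as missing under Idle + DVFS, the cost of doing a given amount of work on a given core at a given operating point, can be illustrated with a simple energy estimate: energy = power x time, with time = work / frequency. All numbers below are hypothetical, and the sketch ignores idle states and the fact that a big core does more work per cycle than a little one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    core: str        # "little" or "big" (illustrative labels)
    freq_mhz: int    # clock frequency at this operating point
    power_mw: float  # assumed active power at this operating point


def run_time_s(opp, work_mcycles):
    """Seconds needed: Mcycles divided by Mcycles per second."""
    return work_mcycles / opp.freq_mhz


def energy_mj(opp, work_mcycles):
    """Energy in millijoules: milliwatts multiplied by seconds."""
    return opp.power_mw * run_time_s(opp, work_mcycles)


def cheapest_opp(opps, work_mcycles, deadline_s):
    """Lowest-energy operating point that still meets the deadline, or None."""
    feasible = [o for o in opps if run_time_s(o, work_mcycles) <= deadline_s]
    if not feasible:
        return None
    return min(feasible, key=lambda o: energy_mj(o, work_mcycles))


# Hypothetical table; real values would have to come from the platform.
OPPS = [
    OperatingPoint("little", 600, 100.0),
    OperatingPoint("little", 1000, 250.0),
    OperatingPoint("big", 1200, 900.0),
    OperatingPoint("big", 1800, 2000.0),
]

for deadline in (1.0, 0.55, 0.1):
    print(deadline, "->", cheapest_opp(OPPS, work_mcycles=600, deadline_s=deadline))
```

With a table like this available, a scheduler that controls DVFS could choose the slowest point that still meets the deadline and move to a big core only when the little ones cannot finish in time.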

Q: How much improvement are we looking for (power-wise)?
A: Something that will get upstream.

Linux 3.10 [by Linus Torvalds on Linux kernel mailing list, June 30, 2013]
Linux kernel 3.10 arrives with ARM big.LITTLE support [Engadget, July 1, 2013]

Thanks to Linus Torvalds’ figurative stroke of the pen, the Linux kernel 3.10 is now final, paving the way for its inclusion in a bevy of Linux distributions, and even offshoots such as Android and Chrome OS. The fresh kernel brings a good number of changes, such as timerless multitasking, a new caching implementation and support for the ARM big.LITTLE architecture. In simplistic terms, the new multitasking method should help improve performance and latency by firing the system timer only once per second, rather than 1,000 times, when tasks are running. Meanwhile, users with both traditional hard drives and SSDs will find performance benefits from bcache, which brings writeback caching and a filesystem agnostic approach to leveraging the SSD for caching operations. Also of significance, Linux kernel 3.10 enhances ARM support by including the big.LITTLE architecture, which combines multiple cores of different types (commonly the Cortex-A7 and Cortex-A15) that focus on either power savings or performance. The full list of improvements is rather lengthy, but if you feel like nerding out with the changelog, just grab a caffeinated beverage and get to it.

Linaro 13.06 Released! [by Amber Graner on Linaro Blog, June 27, 2013]

The Linaro 13.06 release is now available for download!



It’s been a very active cycle for the Builds and Baselines team, reporting that the Continuous Integration (CI) loop for the Linaro Stable Kernel (LSK) Android proof of concept which is based on 3.9.6 kernel version was set up and includes the big.LITTLE IKS and MP patches (also called beta patchset). Support for Kernel CI loop with Android filesystem was added to android-build and CI loop was set up to track the ARM Landing Team (LT) integration tree. The HiSilicon member build with complete CI loop was set up and now tracks the LT kernel tree.



