For more information, join the team and subscribe to the mailing list 
at the bottom of the Launchpad page:

    http://launchpad.net/~hybrid-graphics-linux 

If you are new, please join this team by clicking on the "Join 
Team" link at the right of the Launchpad page. It's important to 
have as many users in the community as possible so that we can 
request appropriate support.

Friday, 26 March 2010

Google Summer of Code 2010: Cool X.Org projects

X.Org Wiki - SummerOfCodeIdeas
Open Source PRIME multi-gpu support

This SoC project involves taking the nvidia optimus technology and implementing the same technology in the open source driver stack. Some work has already been done to flesh this out in http://airlied.livejournal.com/71734.html, but more work is needed to integrate it into the kernel and X.org stacks, along with some client-side method of picking which apps should be started where. The possibility of rendering 2D apps could also be investigated.

Possible mentor: DavidAirlie
X.Org Wiki - SummerOfCodeIdeas
Gallium H.264 decoding

Write a VDPAU state tracker for Gallium. In Gallium, state trackers implement APIs and generate card-independent shaders, which allows supporting multiple cards with a single piece of code. As part of Summer of Code 2008, a student successfully implemented g3dvl, a video layer for decoding XvMC on top of Gallium. That project separated the video decoding code into two parts: a common vl part and an XvMC frontend. The result allowed hardware-accelerated MPEG2 playback on all Gallium-supported hardware. The purpose of this project is to build upon that code and add H.264 and VDPAU support. This requires improving the vl code for the features that differ between MPEG2 (H.262) and H.264, and adding a new VDPAU frontend. Furthermore, not all H.264 features can be implemented using shaders (for example CABAC), so integrating CPU-based and GPU-based acceleration stages will be another challenge. Ultimately, the purpose of this project is the production of a fully open source chain for hardware video decoding on any Gallium-supported GPU.

Possible mentor: StephaneMarchesin
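
For readers unfamiliar with VDPAU, the sketch below shows the client side of the API that such a state tracker would sit behind; these are the standard libvdpau entry points (vdp_device_create_x11, VdpGetProcAddress, VdpDecoderCreate), but the example is only a minimal illustration, with error handling and actual bitstream submission left out.

/* Sketch: what a VDPAU client expects from a Gallium VDPAU state tracker.
 * Build with e.g.: gcc vdpau_sketch.c -o vdpau_sketch -lvdpau -lX11 */
#include <stdio.h>
#include <X11/Xlib.h>
#include <vdpau/vdpau.h>
#include <vdpau/vdpau_x11.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    if (!dpy)
        return 1;

    VdpDevice device;
    VdpGetProcAddress *get_proc_address;
    /* Bootstrap: the driver answering this call would be the Gallium tracker. */
    if (vdp_device_create_x11(dpy, DefaultScreen(dpy), &device,
                              &get_proc_address) != VDP_STATUS_OK)
        return 1;

    /* Fetch the decoder-creation entry point through the dispatch table. */
    VdpDecoderCreate *decoder_create;
    get_proc_address(device, VDP_FUNC_ID_DECODER_CREATE,
                     (void **)&decoder_create);

    /* An H.264 decoder: this is the part the project would add on top of
     * the existing XvMC/MPEG2 vl code. */
    VdpDecoder decoder;
    VdpStatus st = decoder_create(device, VDP_DECODER_PROFILE_H264_HIGH,
                                  1280, 720, 16, &decoder);
    printf("decoder create: %s\n",
           st == VDP_STATUS_OK ? "ok" : "not supported");
    return 0;
}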

Tuesday, 23 March 2010

Google Summer of Code 2010 -- Open Source PRIME multi-gpu support

The application discussion period for would-be students in the Google Summer of Code 2010 has started:
http://socghop.appspot.com/document/show/gsoc_program/google/gsoc2010/faqs#timeline
Potential students can now browse through the different proposals and choose the coolest, sexiest project to spend their summer working on. The coolest of them all in the Hybrid Graphics Linux field is obviously the one David Airlie will be mentoring within X.org:
http://wiki.x.org/wiki/SummerOfCodeIdeas

Open Source PRIME multi-gpu support


This SoC project involves taking the nvidia optimus technology and implementing the same technology in the open source driver stack. Some work has already been done to flesh this out in http://airlied.livejournal.com/71734.html, but more work is needed to integrate it into the kernel and X.org stacks, along with some client-side method of picking which apps should be started where. The possibility of rendering 2D apps could also be investigated.

Mentor: DavidAirlie
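
As a rough illustration of the missing "client side method of picking which apps should be started where", one could imagine a tiny launcher that tags selected applications with a marker environment variable which a PRIME-aware GL loader would then honour. The variable name below (PRIME_SLAVE_GPU) is purely hypothetical; no such interface exists yet, which is exactly what this SoC project is about.

/* Hypothetical sketch of a per-application GPU picker: set a marker
 * variable and exec the program; the (not yet existing) loader-side
 * support would read it and route rendering to the slave GPU.
 * The variable name PRIME_SLAVE_GPU is made up for illustration. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <program> [args...]\n", argv[0]);
        return 1;
    }
    setenv("PRIME_SLAVE_GPU", "1", 1);   /* ask for the discrete GPU */
    execvp(argv[1], &argv[1]);           /* run the app with the marker set */
    perror("execvp");
    return 1;
}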

As stated in the description, David Airlie has already done a bit of work in this area, so the time is ripe for an enthusiastic student to apply for the project and spend the summer experimenting with Linux/X.org code to implement cool new Linux features for the newest line of GPU hardware on the market. All this while earning good money from Google! Who said open-source software doesn't pay?!

If you are an enthusiastic student interested in hybrid graphics features in Linux, contact David now!
If you are not a student but know someone who would be interested, spread the word :-)
If you are none of the above, but would like this project to happen, write your encouraging words as a comment to this post!

Sunday, 21 March 2010

Nvidia Optimus + Linux + Google Summer of Code 2010: Open Source PRIME multi-gpu support features

We are looking for students with basic Linux kernel/graphics notions
interested in applying for the X.org Open Source PRIME multi-gpu
support project in Google Summer of Code 2010:

http://wiki.x.org/wiki/SummerOfCodeIdeas

We are looking for Linux users with Nvidia Optimus-enabled laptops
willing to provide debugging information for Open Source PRIME
multi-gpu support features being worked on. Please join the team and
send an email to the mailing list specifying your laptop model.


You can check the model, version and graphics card details of your laptop with these commands:

sudo dmidecode -s system-product-name

sudo dmidecode -s system-version

lspci -vnnn | perl -lne 'print if /^\d+\:.+(\[\S+\:\S+\])/' | grep VGA

https://launchpad.net/~hybrid-graphics-linux

Friday, 12 March 2010

NVIDIA Optimus + Linux = PRIME @ www.phoronix.com

It seems that NVIDIA Optimus support for Linux is picking up pace and David Airlie has started to work on some proof-of-concept code for multi-GPU rendering in Linux. Since Phoronix posted about it first, we link to their post here, followed by David's original post:

[Phoronix] Proof Of Concept: Open-Source Multi-GPU Rendering!
Now that David Airlie's vga_switcheroo code, which provides hybrid graphics support and delayed GPU switching, has gone upstream in the Linux 2.6.34 kernel, David went on to look for something new to work on in his downtime when not busy with tasks at Red Hat. This new work is on GPU offloading / multi-GPU rendering.

Last month NVIDIA introduced Optimus as a way for dual-GPU notebooks to seamlessly switch between the two GPUs but also to offload the rendering workload to the other graphics processor. This is somewhat similar to NVIDIA's SLI and ATI/AMD's CrossFire for splitting the rendering workload across multiple GPUs, but it has its differences. David ended up developing a proof-of-concept similar to NVIDIA's Optimus that he is calling "Prime" and it works with Intel and ATI GPUs.

David's goals with Prime are to allow a second GPU to render 3D applications onto the screen of the first GPU, with the choice configurable by the client, and to handle just the rendering side. This work isn't as simple as his vga_switcheroo implementation: it required changes to the Linux kernel and the Graphics Execution Manager (GEM), the DRI2 protocol, the X Server and its DRI2 module, and then the actual Linux hardware drivers.

All of this code has already been published as a proof-of-concept, but David shares on his blog that he's unlikely to personally take this work further by upstreaming the code. He has been successful though in using this code to offload the rendering work from an Intel IGP that's driving a display to a discrete ATI graphics processor.

Right now Intel and ATI hardware is supported, but NVIDIA GPUs could be supported too. This work depends upon a system using DRI2 (albeit with these out-of-tree patches) and a compositing manager must be running. David also shares, "To make this as good as Windows we need to seriously re-architect the X server + drivers. At the moment you can't load an X driver without having a screen to attach it to, I don't really want a screen for the slave driver, however I still have to have one all setup and doing nothing and hopefully not getting in the way. We'd need to separate screen + drivers a lot better. Having some sort of dynamic screens would probably fall out of this work if someone decides to actually do it."

It would be wonderful if this work on Prime could be continued and work its way upstream, or if someone took the reins from David to continue this GPU offloading work for open-source drivers. First though, it may make more sense to focus on getting decent performance out of a single GPU before dealing with multi-GPU excitement.


airlied: GPU offloading - PRIME - proof of concept
THIS IS A PROOF OF CONCEPT - it's not going to be upstream unless someone else dedicates their life to it (btw, anyone know anyone at ASUS?)

So NVIDIA unveiled their optimus GPU selection solution for Windows 7, and I decided to see what it would take to implement something similar under DRI. I've named it PRIME for obvious reasons.

Goals:
1. Allow a second GPU to render 3D apps onto the screen of the first, pickable from the client side.
2. Just target the rendering side; I'm assuming the GPU power up/down is similar to what was done for the older switching method.

Restrictions + limitations:
1. Must have compositing manager running
2. Must have second screen configured for slave card (doesn't need to be used)

Test system:
Intel 945 IGP + radeon r200 PCI card - yes this won't be a speed demon.

Terms:
Master: the IGP displaying the output - intel
Slave: the GPU rendering the app - radeon r200 in this case.

Step 1: kernel support

http://git.kernel.org/?p=linux/kernel/git/airlied/drm-testing.git;a=shortlog;h=refs/heads/drm-prime-test
http://cgit.freedesktop.org/~airlied/drm/log/?h=prime-test

The kernel requirements were simple: we needed a way to share a memory-managed object between two kernel device drivers. The kernel has a GEM namespace per device, however this isn't good enough to share with other devices, so I introduced a new PRIME namespace with two ioctls. One ioctl allows the master device to associate a device buffer handle with a name in the prime namespace, and the other allows the slave device to associate a prime namespace handle with a buffer. When the master creates a prime buffer the kernel associates the list of pages with the handle, and when the slave looks up the same handle it retrieves the list of pages and fakes up a TTM buffer populated with those pages as backing store. I've added the concept of a slave object to TTM to allow for this.

The drm repo contains the API wrappers + intel + radeon pieces to call the association functions for buffer objects.
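
To make the two-ioctl flow easier to picture, here is a rough userspace sketch of it; the ioctl names, request numbers and struct layouts are invented for illustration and do not match the actual code in the repositories above.

/* Hypothetical userspace view of the PRIME namespace described above.
 * All names and numbers here are made up for illustration only. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>

struct fake_prime_name {     /* master: GEM handle -> prime name */
    uint32_t gem_handle;
    uint32_t prime_name;     /* filled in by the kernel */
};

struct fake_prime_open {     /* slave: prime name -> local handle */
    uint32_t prime_name;
    uint32_t gem_handle;     /* filled in by the kernel */
};

#define FAKE_IOCTL_PRIME_NAME  _IOWR('d', 0x50, struct fake_prime_name)
#define FAKE_IOCTL_PRIME_OPEN  _IOWR('d', 0x51, struct fake_prime_open)

int main(void)
{
    int master = open("/dev/dri/card0", O_RDWR);   /* Intel IGP   */
    int slave  = open("/dev/dri/card1", O_RDWR);   /* radeon r200 */
    if (master < 0 || slave < 0)
        return 1;

    /* Master exports a buffer it already owns into the prime namespace. */
    struct fake_prime_name name = { .gem_handle = 1 /* some GEM handle */ };
    ioctl(master, FAKE_IOCTL_PRIME_NAME, &name);

    /* Slave looks the name up and gets a TTM-backed buffer sharing the
     * same pages, so both GPUs work out of one allocation. */
    struct fake_prime_open share = { .prime_name = name.prime_name };
    ioctl(slave, FAKE_IOCTL_PRIME_OPEN, &share);

    printf("shared buffer: slave handle %u\n", share.gem_handle);
    return 0;
}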

Step two: DRI2 Protocol
http://people.freedesktop.org/~airlied/prime/0001-dri2proto-add-prime-token.patch
http://people.freedesktop.org/~airlied/prime/0001-prime-support-for-mesa.patch

From the X server point of view, a recent change to the DRI2 layer allowed for multiple device driver names to be associated with a DRI2 end point. The client can currently request either a DRI or a VDPAU driver name. I first extended the DRI2 protocol to add a new buffer type, called PRIME, and added a hack to mesa's glx loader to request the prime driver if an environment variable was specified.
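
The environment-variable hack is not spelled out here, but the idea is roughly the following; the variable name and helper are hypothetical and not the actual mesa code from the patch above.

/* Hypothetical loader-side check, mirroring the hack described above:
 * if the marker variable is set, request the PRIME driver name from the
 * server instead of the default DRI driver. The variable name is made up;
 * the real change is in 0001-prime-support-for-mesa.patch. */
#include <stdio.h>
#include <stdlib.h>

static const char *pick_driver_name(const char *default_name)
{
    const char *env = getenv("PRIME_SLAVE_GPU");   /* made-up name */
    return (env && env[0] == '1') ? "prime" : default_name;
}

int main(void)
{
    printf("loader would request driver: %s\n", pick_driver_name("i915"));
    return 0;
}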

Step 3: X server DRI2 module + drivers

http://people.freedesktop.org/~airlied/prime/0001-intel-add-prime-master-support.patch
http://cgit.freedesktop.org/~airlied/xf86-video-ati/log/?h=prime-test
http://people.freedesktop.org/~airlied/prime/0001-dri2-prime-hackfest.patch


This was the messiest bit and still requires a lot of change. First up I added an interface for the drivers to register as PRIME masters and slaves. The Intel driver registers as master, radeon as slave for my demo. We store these in an array. When a client connects and requests the prime driver, we mark the drawable and redirect the dri2 buffer creation requests to the slave screen driver. The drm authentication is also sent to both kernel drms. It then hooks the swapbuffers command, where it does a region copy, redirects this to the slave driver, and damages the pixmap in the master driver. Now the "interesting" part: my original implementation simply grabbed the window pixmap at dri2 create-buffers time, however there is an ordering issue with compositing; that pixmap is pre-composite redirection, so it isn't actually the pixmap you want to tell the kernel to bind to both gpus. This turned out to function badly; I could see gears all stretched over the front buffer.

So a quick coke + chocolate break later, I had enough sugar to bash out the hack that now exists. DRI2 calls the slave driver's copy region callback, which checks if the drawable pixmap is on the same screen; if it's not, it checks if we've marked the pixmap as a prime pixmap (i.e. one that belongs to the master). If it is, it swaps in the slave's copy, otherwise it calls back into DRI2. This callback calls the Intel driver to make the buffer object backing the pixmap shareable and return the handle, then calls into radeon with the handle to create a new pixmap pointing at the shared buffer object. Once all that is done, radeon copies the back buffer to the shared front pixmap, we return, damage is posted, and the compositor grabs the window pixmap and displays it.
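
To help follow that flow, here is a deliberately simplified sketch of the swap path in plain C; all type and function names are invented for illustration and bear no relation to the real X server or driver code in the patches above.

/* Invented-names sketch of the master/slave swap path described above.
 * Not real X server or driver API; purely illustrative. */
#include <stdbool.h>
#include <stdio.h>

struct pixmap { bool is_prime; int master_handle; };

/* Master (Intel) side: make the buffer behind the pixmap shareable and
 * hand back its prime handle (cf. the kernel PRIME name ioctl). */
static int master_share_pixmap(struct pixmap *pix)
{
    pix->is_prime = true;
    return pix->master_handle;
}

/* Slave (radeon) side: wrap the shared handle in a slave pixmap and copy
 * the slave's back buffer into it. */
static void slave_copy_to_shared(int prime_handle)
{
    printf("slave: blit back buffer into shared pixmap %d\n", prime_handle);
}

/* Master side again: post damage so the compositor picks up the result. */
static void post_damage_on_master(struct pixmap *pix)
{
    (void)pix;
    printf("master: damage posted, compositor grabs the window pixmap\n");
}

/* The hooked swapbuffers/copy-region path for a prime drawable. */
static void prime_swap(struct pixmap *window_pixmap)
{
    int handle = master_share_pixmap(window_pixmap);
    slave_copy_to_shared(handle);
    post_damage_on_master(window_pixmap);
}

int main(void)
{
    struct pixmap win = { .is_prime = false, .master_handle = 42 };
    prime_swap(&win);
    return 0;
}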

So does it work?
On my blistering fast test system with X + xcompmgr running, glxgears was going at 150fps from the r200 PCI card. Hopefully I can get some time on a faster system or one of the dual-GPU laptops.

Caveats:
- When a window manager is running the gears get all corrupted; this looks like the clipping and/or stride matching between the drivers isn't correct. I suspect something with reparenting and decorations; I'm not enough of an X guru to understand this yet, hopefully one of the other hackers can fill me in. Also, before it gets reparented and redirected a frame can land on the real front buffer; again clipping should take care of this, but it isn't working yet. I need to work out how clipping and that stuff works in X/DRI2. - talk to ppl about clipping then JDI.
- Once a client has connected as a prime, we don't tear it down properly, so later clients can end up marked as prime. - work out some sort of resources to turn stuff off.
- Reference counting on the pages in the kernel is iffy; currently i915 ups the page list refcount but never drops it. - solution: JDI.
- Hardcoded /dev/dri paths in dri2 for the slave device. - solution: JDI.
- The radeon driver could in theory be a prime master. - solution: JDI.
- nouveau could support prime master/slave also. - solution: nouveau guys JDI.
- Requires an ugly second screen in xorg.conf to load the slave driver. Can we have a 0-sized screen or maybe a rootless second screen? - solution: rearchitect the X server to allow drivers without screens (6m-1yr work).
- Pageflipping needs to be hacked off in the intel driver. - work out and then JDI.

Where is the video?
Once I get it working with a window manager on a useful machine I might do a video of two gears going.

Where now?
Well, this is a purely academic exercise so far; after a week of kernel fighting I decided to do something new and cool. To make this as good as Windows we need to seriously re-architect the X server + drivers. At the moment you can't load an X driver without having a screen to attach it to. I don't really want a screen for the slave driver, however I still have to have one all set up and doing nothing and hopefully not getting in the way. We'd need to separate screens + drivers a lot better. Having some sort of dynamic screens would probably fall out of this work if someone decides to actually do it.

The kernel bits aren't as ugly as I thought, but I'm not sure if upstreaming them is a good idea without the other bits. The refcounting definitely needs work, as does the cleanup when clients exit.

DRI2 needs some more changes; I might try and flesh it out a bit more and then talk to krh about a sane interface.

I'm probably going to get a forced task switch quite soon, so I might just get to having this running on a W500 or T500 before dropping it for 6 months; so if anyone wants a neat project to play with and has the hardware, feel free to try and take this on.


ASUS, feel free to send me one of the real optimus laptops and I'll get the nouveau guys hooked up and try to RE the nvidia DMA engine.

Thursday, 11 March 2010

NVIDIA Optimus and Linux: an update

Optimus is a way of taking over the processing of DirectX/OpenGL calls the moment they're made. Optimus works by leaving Intel's display driver to put the image on the screen while actively monitoring everything that is happening in relation to displaying it. The library of application profiles inside nVidia's driver will automatically react and switch to the discrete GPU as soon as it detects an application profile where nVidia's GPU would do much better than the integrated graphics.

Using NVIDIA's Optimus technology, when the discrete GPU is handling all the rendering duties, the final image output to the display is still handled by the Intel integrated graphics processor (IGP). In effect, the IGP is only being used as a simple display controller, resulting in a seamless, flicker-free experience with no need to reboot. When less critical or less demanding applications are run, the discrete GPU is powered off and the Intel IGP handles both rendering and display calls to conserve power and provide the highest possible battery life.

The hardware component in Optimus-capable GPUs is the "Optimus Copy Engine", a parallel pipeline next to the 3D Engine. What the Copy Engine does is take the finalized rendered frame created by the 3D Engine and copy its contents from on-board memory to system memory, which is then taken by Intel's IGP and displayed on a frame-by-frame basis. The Optimus Copy Engine is a new alternative to traditional DMA (Direct Memory Access) transfers between the GPU framebuffer memory and the main memory used by the IGP.

In the Microsoft Windows world, Optimus technology leverages Windows 7's ability to allow two independent graphics drivers to be active at the same time. The standard Intel graphics driver is used along with the NVIDIA driver because both display adapters operate independently. Looking within the Windows Device Manager, you'll see two display adapters listed even if Optimus has turned the GPU off.

The Linux community now needs people who will figure out how to keep two graphics drivers active at the same time, and how to switch the mux between the integrated graphics and the discrete card. In nvidia/nvidia configurations, how to access the discrete ROM also needs investigating.

Although the open-source Nouveau community has been very active since their merge into the mainline kernel, nobody seems to have shown interest in getting Optimus hardware to work in Linux. From the users' point of view, we already have Linux users with optimus laptops willing to provide useful debugging information via the hybrid-graphics-linux Launchpad group:
https://launchpad.net/~hybrid-graphics-linux

Hopefully things will get better soon, and the usual lag it takes for Linux to implement features in the graphics world will be shorter than usual for Optimus-enabled laptops and desktops.

Friday, 5 March 2010

Asus PL30JT @ gadgets.softpedia.com

List of NVIDIA Optimus laptops @ blogs.nvidia.com

nTersect Blog - NVIDIA

ASUS is showing 12 systems with Optimus Technology in their booth:

  • U30Jc (13.3”)

  • UL50Vf (15.6”)
  • UL80Jt (14”)
  • U33Jc (13.3”)
  • U43Jc (14”)
  • UL30Jt (13.3”)
  • K52Jc (15.6”)
  • N82Jv (14”)
  • N61Jv (16”)
  • N71Jv (17”)
  • NX90 (18.4”)
  • 1201PN (12.1” with Next Generation NVIDIA Ion)


MSI is showing 2 Optimus notebooks. The new MSI F Series is designed to be professional, slim and powerful, and the FX400 and the FX600 both feature an NVIDIA GeForce 310M GPU with NVIDIA Optimus technology:

  • FX400 (14”)
  • FX600 (15.6”).  

Clevo is displaying 2 Optimus notebooks in their private meeting room suite:

  • B4100 (14”)
  • B5100 (15”).

And in NVIDIA’s Cebit demo rooms we’re showing several of the above
ASUS notebooks, plus 2 additional Optimus notebooks - the Medion Akoya
P6622 (15.6”) with GeForce 310M and the recently announced Acer Aspire
One 532g (10”) with next generation NVIDIA ION.
Optimus is on a roll.

Wednesday, 3 March 2010

NVIDIA Optimus on video @ www.engadget.com

NVIDIA's Optimus technology shows its graphics switching adroitness on video -- Engadget

Explaining automatic graphics switching and the benefits thereof can be
a somewhat dry affair. You have to tell people about usability
improvements and battery life savings and whatnot... it's much more fun
if you just take a nice big engineering board, strap the discrete GPU
on its own card and insert an LED light for the viewer to follow.
NVIDIA has done just that with its Optimus technology -- coming to a laptop or Ion 2-equipped netbook
near you -- and topped it off by actually pulling out the GPU card when
it wasn't active, then reinserting it and carrying on with its use as
if nothing had happened. This was done to illustrate the fact that
Optimus shuts down the GPU electrically, which is that little bit more
energy efficient than dropping it into an idle state. Shimmy past the
break to see the video.

Tuesday, 2 March 2010

vga_switcheroo code makes it into linux-next

The vga_switcheroo code has made it into linux-next, so we are one step closer to full upstream integration of switchable graphics features in Linux:

include/linux/vga_switcheroo.h                  |   57 +
 146 files changed, 18177 insertions(+), 6374 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_buffer.c
 create mode 100644 drivers/gpu/drm/nouveau/nv50_grctx.c
 create mode 100644 drivers/gpu/drm/radeon/evergreen.c
 create mode 100644 drivers/gpu/drm/radeon/evergreen_reg.h
 create mode 100644 drivers/gpu/drm/radeon/radeon_atpx_handler.c
 create mode 100644 drivers/gpu/drm/radeon/reg_srcs/r600
 create mode 100644 drivers/gpu/vga/vga_switcheroo.c
 create mode 100644 include/drm/drm_buffer.h
 create mode 100644 include/linux/vga_switcheroo.h
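
For context, vga_switcheroo exposes a small debugfs interface once it is in the kernel. The sketch below shows how a userspace tool might drive it, assuming debugfs is mounted at /sys/kernel/debug and the interface behaves as merged (writing "OFF" powers down the inactive GPU, "IGD"/"DIS" switch to the integrated/discrete GPU, "DIGD"/"DDIS" do a delayed switch).

/* Minimal sketch of driving vga_switcheroo from userspace.
 * Assumes debugfs is mounted at /sys/kernel/debug and that the
 * vgaswitcheroo interface is present as merged for 2.6.34.
 * Run as root, e.g.: ./switcheroo OFF   (power down the inactive GPU)
 *                    ./switcheroo       (print the current state) */
#include <stdio.h>

int main(int argc, char **argv)
{
    const char *path = "/sys/kernel/debug/vgaswitcheroo/switch";

    FILE *f = fopen(path, argc > 1 ? "w" : "r");
    if (!f) {
        perror(path);
        return 1;
    }

    if (argc > 1) {
        /* e.g. "OFF", "ON", "IGD", "DIS", "DIGD", "DDIS" */
        fprintf(f, "%s\n", argv[1]);
    } else {
        /* No argument: dump the current GPU/power state. */
        char line[128];
        while (fgets(line, sizeof(line), f))
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}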
