
Cell Broadband Engine/Power Architecture notebook

This web log is the product of the collaborative, innovative, virtual minds of the editors of the IBM developerWorks Multicore acceleration (Cell/B.E. SDK) zone.



Tuesday June 17, 2008

Clear conference calendars: Register for Cell/B.E. apps workshop

Workshop coming in July at GA Tech: This two-day workshop (July 10-11, 2008 | agenda) will cover topics ranging from ray tracing to LANL's Roadrunner, and from applications on low-cost Cell/B.E. clusters to computer vision and digital imaging. It will address programmability issues such as languages and compilers, programming models and common runtimes, and ISV programmability frameworks and tooling. There is no charge to attend; attendees must register by June 30, 2008. Please see disclaimer on use of "LANL Roadrunner" name.


Return to zone ||| Return to blog ||| Previous postings



Categories : [   Cell  |  events  ]

Jun 17 2008, 02:10:00 PM EDT Permalink


Tuesday June 17, 2008

Design on a dime: Beyond 45nm, is multithreading dead?

From DAC: Is multithreading really the best way to exploit multicore systems effectively?: That concerning question popped up at the recent 45th Design Automation Conference. It reflects the effort EDA vendors have been putting into adding multithreading capabilities to their tools to help with multicore design; the problem is that at the 45nm node, more designs climb over the 100 million-gate mark and break current IC CAD tools. Parallel processing has traditionally relied on threads, but the benefit of threads starts to bottom out at about four processors.

Read the detailed report to see what some of the best thinkers in the industry think about this question, including Gary Smith of Gary Smith EDA -- he thinks threads are dead: "It is a short-term solution to a long-term problem. Library- or model-based concurrency is the best midterm approach."





Categories : [   general  |  news  ]

Jun 17 2008, 12:34:00 PM EDT Permalink


Tuesday June 17, 2008

Nano, nano: Put tiny satellite in orbit, win a prize

You have until September 2011: The N-Prize ("Nanosatellite"/"Negligible Resources") is a competition to stimulate innovation around inexpensive access to space. To compete, you must launch a satellite weighing between 9.99 and 19.99 grams into Earth orbit and track it for a minimum of nine orbits. The launch must not cost more than US$2000 and must be done before 19:19:09 (GMT) on September 19, 2011. The prize is about US$19,000.





Categories : [   general  |  news  ]

Jun 17 2008, 12:32:00 PM EDT Permalink


Tuesday June 17, 2008

Oddments: Qentangled images captured for first time

Entanglement on film: Quantum entangled images, in this case two random pictures physically separated but linked through their complementary features, have been captured in real time by researchers at the Joint Quantum Institute. They did it by using linked laser beams originating from a single point that produces twin images (a cat face, one inverted and the other backwards) at separate locations. For more on qentanglement, see "Storing nothing and doing it right!," "Photon encoding breaks record," "'It's your mother calling yesterday'," "Qentanglement goes where no man ...," "Honey, get out the Qentanglement photo album," and "Tangled up in that quantum net."

Saucy algorithm exploits symmetries to crack combinatorial problems: The torture level of the scourge of design automation math -- combinatorial problems like "what is the shortest route to send an Internet message around the world?" -- has been reduced by a new "saucy" algorithm. The Saucy algorithm's developers claim it can solve combinatorial problems by finding symmetries among large swaths of possibilities. (Symmetries are mathematically equivalent branches of a search: interchangeable options that lead to the same outcome, so they only need to be calculated once. If you ID all the symmetries in a set before you start comparing outcomes, you can eliminate lots of "duplicates.") They claim that in a test of the previously mentioned Internet message problem, it found an optimum path in under a second.





Categories : [   general  |  news  ]

Jun 17 2008, 12:31:00 PM EDT Permalink


Tuesday June 17, 2008

Product watch: Debugger and deskside work with Cell/B.E. systems

Deskside lets you work with Cell/B.E. code too: Terra Soft's quad-core 970 PowerStation (a four-way SMP system based on the PowerPC 970MP Processor and the CPC945 North Bridge Chip) is a deskside workstation/server that also may be used to prepare and optimize code for Cell/B.E. systems (in fact, Yellow Dog Linux includes the IBM SDK for Multicore Acceleration which is installed by default). You can even use the PowerStation to develop code for and manage clusters built on PS3s or the high performance IBM BladeCenter QS22 systems.

TotalView Debugger gets Cell/B.E. support: Blue Gene/P support too. The TotalView Technologies TotalView 8.5 source code debugger now lets users debug Cell Broadband Engine architecture applications and also delivers enhanced IBM Blue Gene/P support. It supports Linux systems using the IBM Cell/B.E. SDK (SDK 2.1 on FC6 and SDK 3.0 on Fedora 7/RHEL 5.1).





Categories : [   Cell  |  news  ]

Jun 17 2008, 12:29:00 PM EDT Permalink


Tuesday June 17, 2008

Programming with BLAS: SPE thread creation

A quick read on how the default SPE management routines can enable SPE thread creation; for the IBM SDK for Multicore Acceleration 3.0

When a pre-built BLAS application binary (executable) is run with the BLAS library, the library internally manages SPE resources available on the system using the default SPE management routines. This is also true for the other BLAS applications that do not intend to manage the SPEs and want to use default SPE management provided by the BLAS library.


Example application


The sample application that invokes the BLAS-PPE library (from "Programming with BLAS: Using the PPE interface library") -- which invokes the scopy and sdot routines -- is an example of the default SPE management routines.

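#include <stdlib.h>         /* NULL */
#include <blas.h>
#include <malloc_align.h>   /* _malloc_align, from the SDK's misc library */
#include <free_align.h>     /* _free_align, from the SDK's misc library */
#define BUF_SIZE 32

/********************** MAIN ROUTINE **********************/
int main()
{
    int i,j ;
    int entries_x, entries_y ;
    float sa=0.1;
    float *sx, *sy ;
    int incx=1, incy=2;
    int n = BUF_SIZE;
    double result;

    /* Number of array elements spanned by n entries at each stride */
    entries_x = n * incx ;
    entries_y = n * incy ;

    /* 128-byte-aligned buffers; the second argument is log2 of the
       alignment (2^7 = 128) */
    sx = (float *) _malloc_align( entries_x * sizeof( float ), 7 ) ;
    sy = (float *) _malloc_align( entries_y * sizeof( float ), 7 ) ;
    if( sx == NULL || sy == NULL )
        return 1;

    /* sx counts up from 0; sy counts down from entries_y - 1 */
    for( i = 0 ; i < entries_x ; i++ )
         sx[i] = (float) (i) ;
    j = entries_y - 1 ;
    for( i = 0 ; i < entries_y ; i++,j-- )
            sy[i] = (float) (j) ;

    /* Copy sx into every second element of sy, then take the strided dot
       product; the library creates and manages the SPE threads itself */
    scopy_( &n, sx, &incx, sy, &incy ) ;
    result = sdot_( &n, sx, &incx, sy, &incy ) ;

    _free_align( sx ) ;
    _free_align( sy ) ;
    return 0;
}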
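
To make the data flow in the sample above easier to follow, here is a minimal plain-C sketch of what a strided copy and a strided dot product compute for positive strides. The names ref_scopy, ref_sdot, and replay_sample are hypothetical helpers for illustration only; they are not part of the BLAS library and use no SPEs.

```c
#include <assert.h>
#include <stddef.h>

/* Reference strided copy: y[i*incy] = x[i*incx] for i in [0, n).
 * Hypothetical helper, positive strides only. */
void ref_scopy(int n, const float *x, int incx, float *y, int incy)
{
    assert(incx > 0 && incy > 0);
    for (int i = 0; i < n; i++)
        y[(size_t)i * incy] = x[(size_t)i * incx];
}

/* Reference strided dot product: sum of x[i*incx] * y[i*incy]. */
float ref_sdot(int n, const float *x, int incx, const float *y, int incy)
{
    float sum = 0.0f;

    assert(incx > 0 && incy > 0);
    for (int i = 0; i < n; i++)
        sum += x[(size_t)i * incx] * y[(size_t)i * incy];
    return sum;
}

/* Replays the sample's data: n = 32, incx = 1, incy = 2. */
float replay_sample(void)
{
    enum { N = 32, INCX = 1, INCY = 2 };
    float sx[N * INCX], sy[N * INCY];

    for (int i = 0; i < N * INCX; i++)
        sx[i] = (float)i;                       /* 0, 1, 2, ... */
    for (int i = 0, j = N * INCY - 1; i < N * INCY; i++, j--)
        sy[i] = (float)j;                       /* 63, 62, 61, ... */

    ref_scopy(N, sx, INCX, sy, INCY);           /* sy[2*i] = i */
    return ref_sdot(N, sx, INCX, sy, INCY);     /* sum of i*i */
}
```

With this data, the copy overwrites the even-indexed elements of sy with 0 through 31, so the dot product is the sum of squares 0^2 + 1^2 + ... + 31^2 = 10416.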

Control with environment variables


For such applications, you can partially control the behavior of the BLAS library by using certain environment variables. Many environment variables are available to customize SPE launching and memory allocation in the BLAS library; for full control, you can register and use your own SPE and memory callbacks. Here are the environment variables:

  • BLAS_NUMSPES: Specifies the number of SPEs to use. The default is eight (SPEs in a single node).
  • BLAS_USE_HUGEPAGE: Specifies if the library should use huge pages or heap for allocating new space for reorganizing input matrices in BLAS 3 routines. The default is to use huge pages. Set the variable to 0 to use heap instead.
  • BLAS_HUGE_PAGE_SIZE: Specifies the huge page size to use in KB. The default value is 16384KB (16MB). The huge page size on the system can be found in the file /proc/meminfo.
  • BLAS_HUGE_FILE: Specifies the name of the file to be used for allocating new space using huge pages in BLAS 3 routines. The default filename is /huge/blas_lib.bin.
  • BLAS_NUMA_NODE: Specifies the NUMA node on which SPEs are launched by default and memory is allocated by default. The default NUMA node is -1 which indicates no NUMA binding.
  • BLAS_SWAP_SIZE: Specifies the size of swap space in KB. The default is not to use swap space.
  • BLAS_SWAP_NUMA_NODE: Specifies the NUMA node on which swap space is allocated. The default NUMA node is -1 which indicates no NUMA binding.
  • BLAS_SWAP_HUGE_FILE: Specifies the name of the file that will be used to allocate swap space using huge pages. The default filename is /huge/blas_lib_swap.bin.

For more on environment variables, see "Programming with BLAS: Tuning the library for performance."
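
The variables are normally exported in the shell before the application starts. As a rough sketch, they can also be set from inside the program with POSIX setenv. This assumes the library reads its environment when it first initializes, so the call would have to come before the first BLAS routine; that timing is an assumption, and configure_blas_env is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Sets a few of the BLAS_* variables listed above for the current process.
 * Assumption: the library picks them up at its first initialization.
 * Returns 0 on success, -1 if the environment could not be changed. */
int configure_blas_env(int num_spes, int numa_node, int use_hugepages)
{
    char buf[16];

    assert(num_spes > 0);

    snprintf(buf, sizeof buf, "%d", num_spes);
    if (setenv("BLAS_NUMSPES", buf, 1) != 0)
        return -1;

    snprintf(buf, sizeof buf, "%d", numa_node);   /* -1 means no NUMA binding */
    if (setenv("BLAS_NUMA_NODE", buf, 1) != 0)
        return -1;

    if (use_hugepages) {
        /* Huge pages are the documented default, so leave the variable unset. */
        if (unsetenv("BLAS_USE_HUGEPAGE") != 0)
            return -1;
    } else {
        /* 0 selects heap allocation instead of huge pages. */
        if (setenv("BLAS_USE_HUGEPAGE", "0", 1) != 0)
            return -1;
    }
    return 0;
}
```

For example, configure_blas_env(4, 0, 0) would request four SPEs, bind to NUMA node 0, and fall back to heap allocation.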


Taken from the Basic Linear Algebra Subprograms Programmer's Guide and API Reference. Download the SDK 3.0. Check out some reference guides in the Cell Resource Center SDK library.



Categories : [   Cell  |  infobombs  ]

Jun 17 2008, 12:20:00 PM EDT Permalink



Tuesday June 10, 2008

Faster than a speeding bullet: Breaking the pflops barrier

LANL Roadrunner earns its name: It seems LANL's Roadrunner is now poised to take its place as the fastest supercomputer in the world -- think a stack of 100K laptops about one-and-a-half miles tall. In the Roadrunner, two IBM QS22 blade servers and one IBM LS21 blade server are combined into a specialized tri-blade configuration (which can run at 400 gflops) for a total of 3,456 tri-blades. Standard processing like file system I/O is taken care of by the Opteron processors while math-/CPU-intensive tasks go to the Cell/B.E. processors. (There are more interesting facts in the Roadrunner fact sheet.)

Even the New York Times is getting in on the story: "If all six billion people on earth used hand calculators and performed calculations 24 hours a day and seven days a week, it would take them 46 years to do what the Roadrunner can in one day." Other coverage includes:

You got the hardware; what about the software?: This EE Times article discusses efforts to enable all sorts of software to take advantage of the speed of multicore systems. Buddy Bland, project director for a major supercomputer center at Oak Ridge National Lab (which hopes to install its own pflops system this year), noted that "getting applications to scale is our biggest challenge" and goes on to add that "it turns out you get just as much advancement from better software and algorithms as you do from better hardware." Oak Ridge has been testing such parallel programming languages as IBM's X10, Cray's Chapel, and Sun's Fortress.

Bill Thigpen, chief of supercomputing engineering at the NASA Ames Research Center, has observed an increasing gap between the rate at which benchmark performance is rising and the increases in the ability to do actual work: "One of the challenges is being able to get the available work out of the theoretical performance peak." He goes on to note that scaling is a challenge: "Communications becomes a bigger part of your work. If you spend increasing time passing information between the processors, the processors are not doing as much work on the real issue."

The article goes on to illuminate why the important thing researchers learn from the LANL Roadrunner may not have to do with speed but with how the heterogeneous processors interact.

UPDATE 06/12/08: Panel on LANL's Roadrunner at ISC08: In a special panel session at the International Supercomputing Conference (June 17-20, Dresden; session on June 18) entitled "RoadRunner: The First Petaflop/s System in the World and its Impact on Supercomputing," two leaders of the drive to build Roadrunner -- Dr. Andrew White from Los Alamos and Dr. Don Grice of IBM -- will be joined by HPC experts to discuss the impact the system will have on the world of computing. Included are

  • Lawrence Berkeley National Laboratory's Dr. Erich Strohmaier on "All #1 Systems in the TOP500 So Far."
  • Drs. Grice and White on Roadrunner's hardware and software architecture and applications.
  • University of Tennessee/Oak Ridge National Laboratory's Dr. Jack Dongarra on "Roadmap to Exaflop/s in the Year 2019."
  • Reactions and comments on the achievement from the US, Europe, and Asia.
  • An audience question period.

Other conference highlights include

  • HPC and next-generation climate modeling.
  • The past year in perspective.
  • Harnessing the potential of multicore/manycore processors.
  • Deciding whether HPC is going green.
  • HPC challenges and opportunities in the era of the petaflops.
  • And you get to grill leading HPC vendors!

Please see disclaimer on use of "LANL Roadrunner" name.





Categories : [   Cell  |  events  |  news  ]

Jun 10 2008, 01:08:00 PM EDT Permalink


Tuesday June 10, 2008

Games people play: How to install Ubuntu on your PS3

Tutorial teaches you to install Ubuntu on your PS3. Enough said.





Categories : [   Cell  |  papers  ]

Jun 10 2008, 01:03:00 PM EDT Permalink


Tuesday June 10, 2008

It came from the Lab: Processors that are waterfall cool

IBM water cools 3D chips: IBM Research Zurich has demonstrated 3D processor stacks that are cooled (at a rate of 180W per layer) by water flowing through 50-micron channels between the chips. With 3D chip stacks, enough heat can get trapped between the layers to melt the cores, so an aqueduct is etched into the silicon oxide on the back of each layer. (Eventual plans are for memory chips between processor cores to increase interconnections by a factor of 100 and reduce feature size by a factor of 10.) The commercial release target is 2013.





Categories : [   general  |  news  ]

Jun 10 2008, 01:02:00 PM EDT Permalink


Tuesday June 10, 2008

Trends and tradeoffs: Real community collaboration development

IBM alphaWorks: From software theory to fact: This ZDnet UK profile is a real tribute to the fine work of IBM alphaWorks, which provides a place for the developer community to preview and collaborate on emerging technology from IBM's research labs (and, of course, turn it into commercial products). To date, 40 percent of the technologies showcased on alphaWorks have migrated into IBM products; the site also provides more than 200 downloads for developers. Technologies that really take off with developers often get picked up by developerWorks (which goes on to build a highly interactive center of theoretical and practical resource material, Q&A forums, code exchanges, blogs, podcasts, etc.). The author interviews alphaWorks senior software engineering manager Laura Bennett, who talks about the "next big things" in technology: Software-as-a-service, Web 2.0 and collaboration, Semantic Web, rapid application development, data visualization, and health care. Bennett also discusses some alphaWorks technologies that have had a significant impact on IBM, like the Cell Broadband Engine (blade servers, supercomputers, game consoles), the Unstructured Information Management SDK (informational semantics), and autonomic computing technologies (which have almost "disappeared" as standalone topics because of their high rate of integration into such IBM products as Tivoli, as well as into third-party products).





Categories : [   Cell  |  general  |  news  ]

Jun 10 2008, 01:00:00 PM EDT Permalink


Tuesday June 10, 2008

Conventional Wisdom alert: The next ubiquitous toolchain for embedded

Is it possible that open source Eclipse might win over proprietary?: EE Times contributing technical editor Richard A. Quinnell highlights a gradual shift that has been going on for years -- product announcements and initiatives that point to the Eclipse Framework establishing a dominant position as the number-one embedded tool chain. Many embedded devtool vendors like Mentor Graphics, QNX Systems, and Wind River are adapting their tools for use with Eclipse. And the Eclipse Foundation is stepping up its pursuit of device development via four initiatives to enhance its Device Software Development Platform: Real-time software components, Windows Embedded CE support, the Eclipse device-debugging project, and the target communications framework. The Eclipse Framework is a set of open-source components that can be combined to form a software development tool suite that includes basic editors, compilers, debuggers, and a user interface; it can be configured as an IDE for languages such as Java and C/C++ (and soon, Ada) by using the relevant components. (Release 1.0 of the debugging project will be available in the next Eclipse Framework release, Ganymede -- see developerWorks Eclipse project resources to stay up to date on this.)





Categories : [   general  |  news  ]

Jun 10 2008, 12:58:00 PM EDT Permalink


Tuesday June 10, 2008

Nano, nano: Paper of steel

Nanocellulose makes iron look like a wimp: Regular paper is made from a crystalline polymer of glucose called cellulose; the process to make it generates quite long microfibers that are full of defects and can break apart when stressed. Swedish Royal Institute researchers have figured out how to keep the cellulose fibers small (about 1000 times smaller than in regular paper) and relatively defect-free; they then coated them with carboxymethanol, which readily forms hydrogen bonds that help fibers make tight contacts with one another. This new paper has a tensile strength about seven times greater than regular paper and over one-and-a-half times greater than cast iron. The researchers say that beyond the obvious uses, these fibers could replace carbon in reinforced plastics construction (cheaper and better); the material is also easier and cheaper to dry, which makes it cheaper to produce.





Categories : [   general  |  news  ]

Jun 10 2008, 12:56:00 PM EDT Permalink


Tuesday June 10, 2008

Oddments: Ped power

You've heard of "voting with your feet": Will your footsteps now be used to generate electricity? Underfloor generators may be the next big thing in every public place you go. The pressure of your footfalls compresses pads under the flooring, driving fluid through tiny turbines which generate electricity which is then stored in batteries. Researchers have calculated that 34,000 striders an hour can power 6,500 lightbulbs. And generation is not limited to heel strikes -- any movement a structure makes can be converted into power. Recently, trains passing over a Midlands UK railway bridge generated more than enough electricity to power a flood detector. Any building or towering structure that sways in the wind is a candidate for making a little extra charge, too. And one more thing about the walking power plant -- it can do double duty. With the technology embedded in both a floor and the heel of a shoe, the walker can power room lights and recharge his own personal electronics.

Robots: Robofish keep track of each other without creator prompting. The Game of Life from Duke researcher (but with robots).

Futuretech: Computerworld tracks the near-reality of a long-awaited device, e-paper.

Why time moves in a line from yesterday to tomorrow (and other juicy bits): Caltech researchers have a new model of our universe, one that may contain a signature of a time before the Big Bang and could explain why time moves in a straight, one-way line. In the new model, fluctuations in the cosmic background radiation (considered proof of the Big Bang and thought to be the seeds that galactic clusters grew from) might be evidence that our universe was "pinched off" an existing parent universe. The model also postulates that:

  • Universes can spontaneously generate from empty space.
  • Spontaneous generation of a universe is probably not a spectacular event. Co-author of the model Professor Sean Carroll says a "universe could form inside this [a] room and we’d never know."
  • Originally, one-way time movement (known as the "arrow of time") was attributed to the second law of thermodynamics which insists that systems move over time from order to disorder. This model depends on the major assumption that the universe started life in an ordered state.




Categories : [   general  |  news  ]

Jun 10 2008, 12:54:00 PM EDT Permalink



Friday June 06, 2008

Programming with BLAS: Five maximum performance tips

A quick read on five tips to gain maximum library performance; for the IBM SDK for Multicore Acceleration 3.0

These five tips will let you leverage maximum performance from the BLAS library.


128byte-aligned


Make the matrices and vectors 128byte-aligned: Memory access is much more efficient when the data is 128byte-aligned.
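
As a sketch of this tip, the helper below allocates a float buffer on a 128-byte boundary. It uses the POSIX call posix_memalign as a portable stand-in for the SDK's _malloc_align (which takes the log2 of the alignment, 7 for 128 bytes). The names alloc_aligned_floats, is_aligned_128, and BLAS_ALIGN are hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLAS_ALIGN 128   /* alignment in bytes; hypothetical constant name */

/* Returns a buffer of n floats that starts on a 128-byte boundary,
 * or NULL if the allocation fails. Release it with free(). */
float *alloc_aligned_floats(size_t n)
{
    void *p = NULL;

    assert(n > 0);
    if (posix_memalign(&p, BLAS_ALIGN, n * sizeof(float)) != 0)
        return NULL;
    return (float *)p;
}

/* Nonzero when p sits on a 128-byte boundary. */
int is_aligned_128(const void *p)
{
    return ((uintptr_t)p % BLAS_ALIGN) == 0;
}
```

A check like is_aligned_128 can be useful as a debug assertion on any buffer handed to the library, including buffers that were allocated elsewhere.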


Huge pages


Use huge pages to store vectors and matrices. By default, the library uses this feature for memory allocation done within the library.
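
Before relying on huge pages, it can help to confirm the page size the system is actually configured with, since the library's BLAS_HUGE_PAGE_SIZE default is 16384KB. The sketch below scans text in the format of the Linux /proc/meminfo file for its Hugepagesize line; hugepage_size_kb is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Scans meminfo-style text for the "Hugepagesize:" line.
 * Returns the size in KB, or -1 if no such line is found. */
long hugepage_size_kb(FILE *in)
{
    char line[256];
    long kb;

    assert(in != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        /* Only the exact field name matches; "HugePages_Total:" does not. */
        if (sscanf(line, "Hugepagesize: %ld kB", &kb) == 1)
            return kb;
    }
    return -1;
}
```

A caller would open /proc/meminfo with fopen, pass the stream in, and compare the result with the value it plans to give BLAS_HUGE_PAGE_SIZE.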


NUMA binding


Use NUMA binding for the application and the library. Set the BLAS_NUMA_NODE environment variable (a quick look is in Tuning the library for performance) to enable this feature for the library. BLAS_NUMA_NODE can be set to 0 or 1 for a dual-node system. An application can enable NUMA binding by using either the command-line NUMA policy tool numactl or the NUMA policy API libnuma, both provided on Linux.


Swap space


Use the swap space feature (quickly described in Tuning the library for performance) for matrices smaller than 1KB with appropriate NUMA binding.


Start with the right numbers


The library performs better on large vectors and matrices. Performance of optimized routines is better when the stride value is 1. Level 3 routines show good performance when the numbers of rows and columns are multiples of 64 for single precision (SP) and 32 for double precision (DP).
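
One way to act on the multiples-of-64 and multiples-of-32 guidance is to round matrix dimensions up before allocating. The sketch below shows only the arithmetic; round_up, padded_dim_sp, and padded_dim_dp are hypothetical helper names, and whether padding pays off for a given problem size is something to measure.

```c
#include <assert.h>

enum { SP_MULTIPLE = 64, DP_MULTIPLE = 32 };   /* from the tip above */

/* Rounds n up to the next multiple of m (n >= 0, m > 0). */
int round_up(int n, int m)
{
    assert(n >= 0 && m > 0);
    return ((n + m - 1) / m) * m;
}

/* Padded dimension for single-precision Level 3 routines. */
int padded_dim_sp(int n) { return round_up(n, SP_MULTIPLE); }

/* Padded dimension for double-precision Level 3 routines. */
int padded_dim_dp(int n) { return round_up(n, DP_MULTIPLE); }
```

For matrix multiplication, filling the extra rows and columns with zeros leaves the original block of the product unchanged, so a 100 x 100 single-precision problem could be run as 128 x 128.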


Taken from the Basic Linear Algebra Subprograms Programmer's Guide and API Reference. Download the SDK 3.0. Check out some reference guides in the Cell Resource Center SDK library.



Categories : [   Cell  |  infobombs  ]

Jun 06 2008, 06:31:00 PM EDT Permalink



Tuesday June 03, 2008

It came from the Lab: Grand Theft Auto and life sciences research

How does GTA4 further R&D: Follow this logic. To say Grand Theft Auto IV is extremely popular is like saying the sun will definitely rise in the east in the morning. What makes it popular (this is for people who have never engaged it)? The graphics and animation are mindblowing. (In fact, in the US TV show Saturday Night Live, two actors play two of the mobsterlike characters and it is almost a tough task to tell them apart from their synthetic versions. Maybe the movement of the animated ones is smoother and more realistic.) Reception of the game has demonstrated to both the gaming industry and the semiconductor manufacturing industry that there is a tremendous market for the hardware and software to bring "real" to the virtual world. Once this level of processing power is in place, it can then be easily used for life (or other) sciences research.





Categories : [   Cell  |  news  ]

Jun 03 2008, 12:52:00 PM EDT Permalink
