You might notice that on the right-hand side of the diagram there is a coherent connection to the CPU and main memory. If this were a simple audio block, there would be no need for CPU coherency, nor for the AXI bridge that links everything together. AXI is ARM's bus standard, so why would you put that in an audio unit? Think there's more to this than beeps now?
The audio block was completely unique. That was designed by us in-house. It's based on four Tensilica DSP cores and several programmable processing engines. We break it up as one core running control, two cores running a lot of vector code for speech, and one for general-purpose DSP. - See more at: http://www.dailytech.com/Microsoft+Clai ... ycP4g.dpuf
The exact count is a bit nebulous, though: Microsoft claims "15 special-purpose co-processors," not counting the CPUs and GPUs; eight of those are for audio.
First things first, let’s talk field-programmable gate arrays (FPGA). As the name implies, an FPGA is essentially a blank chip that can be repeatedly reprogrammed after manufacturing. With very few exceptions, every chip inside your computer is hard-coded (at the time of manufacturing) to perform just one set of functions. Your CPU can only do exactly what Intel or AMD designed it to do. You can’t take your CPU and turn it into a GPU. But you can take an FPGA, program it to perform one set of functions (say, graphics), and then reprogram it to handle another type of workload (say, sorting through databases).
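The reprogrammability described above comes from the FPGA's logic fabric being built out of lookup tables (LUTs) whose configuration bits can be rewritten after manufacturing. A minimal sketch of that idea, modeling a single 2-input LUT in plain Python (the `Lut2` class and its method names are illustrative, not any vendor's API):

```python
# Sketch of why an FPGA is reprogrammable: its fabric is made of lookup
# tables (LUTs) whose truth tables can be rewritten after manufacturing.
# A hard-wired (ASIC-style) gate can never change its function.
class Lut2:
    """A 2-input lookup table: 4 configuration bits define any 2-input gate."""
    def __init__(self, truth_table):
        assert len(truth_table) == 4
        self.bits = list(truth_table)

    def reprogram(self, truth_table):
        # Rewriting the configuration bits changes the gate's function,
        # loosely analogous to loading a new bitstream into an FPGA.
        assert len(truth_table) == 4
        self.bits = list(truth_table)

    def __call__(self, a, b):
        # Inputs select one of the four stored bits.
        return self.bits[(a << 1) | b]

lut = Lut2([0, 0, 0, 1])     # programmed as AND
print(lut(1, 1))             # -> 1
lut.reprogram([0, 1, 1, 0])  # same "hardware", now behaves as XOR
print(lut(1, 1))             # -> 0
```

A real FPGA wires millions of such LUTs together through a configurable routing network, which is how one chip can be "graphics" today and "database sorting" tomorrow.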
The main advantage of an FPGA, other than its customizability, is its monstrously high performance. In much the same way that an ASIC is by far the fastest and most efficient way of processing a specific workload (and thus why they're used for Bitcoin mining), an FPGA is also very fast and efficient. They're not quite as fast or efficient as ASICs, but what you lose in speed you gain in reprogrammability (again, ASICs are set in stone at manufacturing time).
"Rather than banking on scaling to many, many more cores, let's take a different path," Microsoft researcher Doug Burger told El Reg. "We think specialization is going to be the next big thing – specializing hardware for different workloads."
The problem with FPGAs is that they are harder to program than GPUs, so MS has to provide the framework & tools to support them.
It is also why HSA generally adopts DSPs/FPGAs, not just CPU/GPU. The DSP/FPGA audio block of the X1, for example, is much like the PS3's Cell: a fusion of SIMD/vector and scalar processing.
"This is effectively a super-computer design," he says. "This is a design out of the super-computer realm. So I expect that we're going to continue to see fairly large improvements in GPU output as people really tune these data sets."
cigi silk wrote: Positive news for Xbox One and the use of eSRAM
Yebis 2 Uses Xbox One eSRAM For Buffer, PS4 Unified Architecture Irrelevant Since Yebis Is GPU-Based
Read more at http://gamingbolt.com/yebis-2-uses-xbox ... 9cKXy7T.99
The power disparity between the two GPUs is by no means that large. The point he is making is that GDDR5 is THE main memory for the whole PS4 APU, with a unified architecture. ESRAM is specifically a bandwidth feeder for the XOne's GPU, and offers more bandwidth provided developers use it alongside the conventional rendering path through the GPU.
Source: Yebis http://www.siliconstudio.co.jp/middleware/yebis/en/
Optics post effect middleware
The Hybrid Memory Cube Consortium, a group of memory industry giants led by Micron and Samsung, has announced a new member: software behemoth Microsoft.
The Hybrid Memory Cube (HMC) technology espoused by the group is a planned implementation of through-silicon via (TSV) technology - vertical conduits through a chip's silicon infrastructure which allows components to be placed in a three-dimensional mesh rather than in a traditional planar manner - which promises to dramatically improve the performance of future memory modules.
Prototypes shown off by Micron earlier this year have already proved more than capable of taking over from traditional DRAM components, showing peak throughput of 128GB/s compared to the 12.8GB/s from commercial-grade DDR3 modules created on a planar process.
It's not all about performance, however: the process also promises dramatic power savings for mobile gadgets, with Micron's prototype modules showing a 70 per cent reduction in power draw during data transfer in a module one-tenth the size of current-generation technologies.
The technology is impressive enough to have won a stack of awards, including the Linley Group's Best New Technology Award in its 2011 round-up. Thus far, however, it is notable by its absence from the commercial market. While there is no timescale available on when the product may launch, the fact that Microsoft is showing an interest suggests it's not too far away from becoming a commercial reality.
'HMC technology represents a major step forward in the direction of increasing memory bandwidth and performance, while decreasing the energy and latency needed for moving data between the memory arrays and the processor cores,' claimed Microsoft's general manager of strategic software/silicon architectures KD Hallman in an announcement to press. 'Harvesting this solution for various future systems could lead to better or novel digital experiences.'
A High Memory Bandwidth FPGA Accelerator for Sparse Matrix-Vector Multiplication
Jeremy Fowers, Kalin Ovtcharov, Karin Strauss, Eric Chung, and Greg Stitt, in International Symposium on Field-Programmable Custom Computing Machines, IEEE [May 2014]
Sparse matrix-vector multiplication (SMVM) is a crucial primitive used in a variety of scientific and commercial applications. Despite having significant parallelism, SMVM is a challenging kernel to optimize due to its irregular memory access characteristics. Numerous studies have proposed the use of FPGAs to accelerate SMVM implementations. However, most prior approaches focus on parallelizing multiply-accumulate operations within a single row of the matrix (which limits parallelism if rows are small) and/or make inefficient use of the memory system when fetching matrix and vector elements. In this paper, we introduce an FPGA-optimized SMVM architecture and a novel sparse matrix encoding that explicitly exposes parallelism across rows, while keeping the hardware complexity and on-chip memory usage low. This system compares favorably with prior FPGA SMVM implementations. For the over 700 University of Florida sparse matrices we evaluated, it also performs within about two thirds of CPU SMVM performance on average, even though it has 2.4x lower DRAM memory bandwidth, and within almost one third of GPU SMVM performance on average, even at 9x lower memory bandwidth. Additionally, it consumes only 25W, for power efficiencies 2.6x and 2.3x higher than CPU and GPU, respectively, based on maximum device power.
We would like to thank Doug Burger and the Catapult team for their support and help with this project. We would also like to thank Adrian Macias from Altera for helping with the accumulator implementation.
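For readers unfamiliar with the kernel the paper accelerates, here is a plain-Python sketch of baseline sparse matrix-vector multiplication (y = A·x) over the standard CSR encoding. This is only the reference computation; the paper's contribution is a different, FPGA-specific encoding that exposes parallelism across rows, which this sketch does not attempt to show. The function name and layout here are illustrative assumptions:

```python
# Baseline sparse matrix-vector multiply over CSR (compressed sparse row):
# values/col_idx hold the nonzeros row by row; row_ptr[i]:row_ptr[i+1]
# is the slice of nonzeros belonging to row i.
def csr_spmv(values, col_idx, row_ptr, x):
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # The gather x[col_idx[k]] is the irregular memory access
            # pattern the abstract describes as hard to optimize.
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

# A = [[10,  0,  0],
#      [ 0, 20, 30],
#      [ 0,  0, 40]]
print(csr_spmv([10, 20, 30, 40], [0, 1, 2, 2], [0, 1, 3, 4], [1, 2, 3]))
# -> [10.0, 130.0, 120.0]
```

Note how short rows (row 0 and row 2 have one nonzero each) offer almost no intra-row parallelism, which is exactly why the paper's encoding parallelizes across rows instead.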
FPGAs are a complementary, semi-custom, co-processing resource that is "picking off" the parallelizable tasks from CPUs. FPGAs do this – at lower clock speeds and power – by deploying multi-core parallelism.
HPRC (High Performance Reconfigurable Computing) as a branch of Computer Science is thriving. Largely driven by GPGPU (general-purpose graphics processing unit) growth, HPRC is also supported by FPGA-based applications. The programming environment is considered to be the main obstacle preventing FPGAs from being used to their full potential in accelerators.
FPGAs allow designers to change their designs very late in the design cycle – even after the end product has been manufactured and deployed in the field.
Think of it all as nano-coding. The philosophy here is a future software paradigm that requires hardware to be designed around it. Much of this is in line with what MS has said technically about its cloud infrastructure, the X1 design, upcoming (new) DX12 GPUs, etc. The future is here; it just takes time to code it all.
People really fail to realize that MS spends more money annually on research than most of its competitors in the hardware/software computing space put together. It's why they (MS) have pretty much given up explaining themselves and are just letting things evolve and show as they come naturally, while eventually everyone else will scramble to catch up. It is why Intel, AMD, and Nvidia are all excited and part of what is coming around the bend; DX12 is a huge part of it all, with new GPU architectures built around it. While existing cards will be compliant with DX12, they will not be able to touch new cards built around what DX12 is really about.
It's all related, all next gen technology and software design.
Jibla wrote: Honestly wild that you get 720p out of all those JPEG images!