PS3 Fault finding YLOD with the SYSCON - First steps and Error reporting

Has anyone tested an accelerated thermal config for slims? I've been looking through earlier posts but can't remember or find whether I did it, someone else did it, or someone was about to and didn't get the time.
Edit
As requested, I would like to modify the fan speed of a slim DYN-001 board in a way that works with OFW and a PSN account, similar to webMAN.
30% speed, or 35%.
I will try tomorrow to remember and test by rereading parts of different threads and the tests I've done. This is barely ever required, so I didn't pay attention until now. I just did a rushed test back then and quickly forgot it.
Edit 2
Got the info needed for the tests in the syscon fan settings thread, thanks to sandungas!
 
Well, I know the script requires some dependencies: pyserial and pycryptodome. I'm not sure if you can get them working on an RPi like you can in Python on Windows.

The adapter is cheap. So why not?

It's just more convenient to use something that I already have.
But no need to buy it, I managed to get it working with the RPi 4 :D.
I just needed to connect them to a common ground, then I got a successful auth with the syscon.
So, what commands do I need to use now to get the logs while the PS3 is on/turning off?
 
By the way @RIP-Felix, on the PQX-001 motherboard (eMMC model), is the VRM on top of Cell and RSX, and can I just replace the parts with ones from a dead board, probably one that has been through a direct reflow? I pretty much think playing PS2 Classics put the strain on my board's VRM. I remember reading that Cell and RSX both have to work very hard to play PS2 games.
I didn't get my answer, @RIP-Felix. Also, you told me my errors are NEC related, but as far as I have gone through threads, wikis, guides and error logs, they seem more like chip-related errors. I mean, people are still getting errors after replacing the NECs, and 1002 is also a VRAM failure. I just hope changing the caps gets it running.
Please use the schematic to cross-reference everything. If you are going to fix electronics you need to get into the habit of looking at schematics and datasheets anyway.
Not gonna be a problem anyway, I do that a lot. Things have been much easier using them lately. It still gets really frustrating if schematics are not available.
 
So, what commands do I need to use now to get the logs while the PS3 is on/turning off?
I can't remember if you said this was a Mullion or a Sherwood SYSCON. Anyway, if it's a Mullion you'll need to gain internal SYSCON access to use some of the commands. Sherwood doesn't need that, but may or may not have the same commands. IDK, I don't have any.

bringup will start the console. After it's on the XMB (main menu), press enter to see more of the boot string. That should show a successful power on sequence (POS) followed by the bootloader.

Then use the commands powerstate, errlog, and becount. powerstate should show which systems are on/off, and errlog is obvious. With internal access on Mullion, there will be a timestamp for the errors, so we can see if they are new or old errors. You should back that up as soon as you get a console, because it tells the console's history and can be helpful to piece together what might be wrong. becount will show the total startup (bringup) and shutdown counts. That can be useful to see how the console was used and how much. It also shows the total uptime (what was recorded; it doesn't record uptime if there was an unexpected shutdown, which is often the case in data farms where they hard shut down the console or it was only ever shut down during a power outage).

Then use shutdown to turn the console off. If it hangs and generates an error during step number 90, then we may see something useful in the log before it stops sending data.

Hopefully, your RPi terminal has enough memory to retain the session log of the entire string. Copy and paste it to a txt document and then use the insert code function on the forum toolbar to paste the string here.
 
I didn't get my answer, @RIP-Felix. Also, you told me my errors are NEC related, but as far as I have gone through threads, wikis, guides and error logs, they seem more like chip-related errors. I mean, people are still getting errors after replacing the NECs, and 1002 is also a VRAM failure. I just hope changing the caps gets it running.
Not real sure what you mean.

I finally got my ps3devwiki account working. I just uploaded some notes I've been making to my copy of the SYSCON error code PDF. I agree that it was confusing they say RSX VRAM Power Fail when I've already proven it can be caused by RSX_VDDC filter caps. I know for a fact that bad tokins on VDDC can cause 1002 errors. However, I do not know if that's the only way 1002 errors occur.

If SONY says it's a VRAM power issue, then there's probably a reason! There is something you can try before taking off the tokin though. And it might help us test a hypothesis. Refer to these pictures for what I'm talking about:
RSX PWR Flowchart.png
RSX Pinout (Board View).jpg
COK-001 PWR FlowChart.jpg
@M4j0r explained this to me before in the frankenstein thread. I thought that VRAM was the voltage to the RSX RAM modules and that it is supplied by VDDR. But it's actually supplied by the 1.8V_FBVDDQ, and 1.2V_VDDR is supplying the voltage for SPI communication across the FlexIO. That makes sense.
VDDR is for the Rambus FlexIO Core. FBVDD is for the VRAM.
So when the PDF says VRAM Power it's talking about the 1.8V_VDDQ rail in the upper right hand corner. I never tested that. I haven't tried populating those other 2 tantalum pads to see if it clears up a 1002 error. I would like someone with a 1002 error to probe there with an oscilloscope to characterize the noise/ripple before and after populating the pads. I wonder if it would affect the 1002 error at all. So you could test that hypothesis for us.

I don't see why replacing the tokins on VDDC would affect VDDQ though. And I have confirmed that bad tokins can cause 1002 errors. Perhaps it's because ripple/noise entering anywhere affects the VRAM, and VDDC has a lot of current/load, meaning lots of noise/ripple to bleed over to the VRAM. Perhaps that can trigger VRAM Power issues, even though it's not a problem on the VDDQ PWR rail. Or maybe they are connected internally in some way...IDK.
 
Hi, these are the error logs pulled from 2 PS3s with YLOD, and I would like to seek some help with the diagnosis.

I did check against the Syscon error log codes and this forum. Is the CECHA01 a good candidate for NEC/Tokin replacement?

CECHA06, on which I replaced the NEC/Tokin (with the wrong tantalum cap, one with high ESR) and which has YLOD again after being revived for less than a day. I plan to install the tantalizer with the correct cap if the error is related.
ERR 00: 00000000 A0093004 FFFFFFFF
ERR 01: 00000000 A0093004 FFFFFFFF
ERR 02: 00000000 A0093004 FFFFFFFF
ERR 03: 00000000 A0093004 FFFFFFFF
ERR 04: 00000000 A0093004 FFFFFFFF
ERR 05: 00000000 A0801001 0B48880E
ERR 06: 00000000 A0801001 0B488772
ERR 07: 00000000 A0801001 0B488735
ERR 08: 00000000 A0801001 0B4886B0
ERR 09: 00000000 A0801001 0B489524
ERR 10: 00000000 A0801002 0B49340E
ERR 11: 00000000 A0801002 0B4933B1
ERR 12: 00000000 A0801002 0B492AEC
ERR 13: 00000000 A0801002 0B492335
ERR 14: 00000000 A0801002 0B4887F0
ERR 15: 00000000 A0801002 0B4886F8
ERR 16: 00000000 A0801002 0B4886B9
ERR 17: 00000000 A0801002 0B488719
ERR 18: 00000000 A0801001 0B4886A8
ERR 19: 00000000 A0071002 29111D9B

CECHA01 with YLOD, haven't attempted any repair.
ERR 00: 00000000 A0101002 FFFFFFFF
ERR 01: 00000000 A0101002 0B48900A
ERR 02: 00000000 A0101002 0B489007
ERR 03: 00000000 A0101002 0B489005
ERR 04: 00000000 A0101002 0B489003
ERR 05: 00000000 A0101002 0B489000
ERR 06: 00000000 A0101002 0B488FFE
ERR 07: 00000000 A0101002 0B488FFB
ERR 08: 00000000 A0101002 0B488FF8
ERR 09: 00000000 A0101002 0B488FF5
ERR 10: 00000000 A0101002 0B488E03
ERR 11: 00000000 A0101002 0B488E01
ERR 12: 00000000 A0101002 0B488DFF
ERR 13: 00000000 A0101002 0B488DFD
ERR 14: 00000000 A0101002 0B488DFA
ERR 15: 00000000 A0101002 0B488DF8
ERR 16: 00000000 A0101002 0B488DF5
ERR 17: 00000000 A0101002 0B488DF2
ERR 18: 00000000 A0101002 0B488DF0
ERR 19: 00000000 A0101002 0B488DED

On one of the PS3s, the 3 pads (Diag, Tx, Rx) were ripped off due to unfortunate mishandling :( and I would like to know of alternative points, if any, mostly for Rx, which I cannot trace.

Thanks
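
For anyone tallying logs like the two above, here is a minimal Python sketch of how the three columns could be split up. It rests on assumptions that are not confirmed in this post: that the middle word is `A0`, then a one-byte step number, then a four-digit error code (which matches how 1001/1002/3004 are talked about in this thread), that the last word is a clock value in seconds since 2000-01-01, and that `FFFFFFFF` means no clock value was stored. The function names are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Matches lines such as "ERR 05: 00000000 A0801001 0B48880E".
ERR_LINE = re.compile(
    r"ERR\s+(\d+):\s+([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})"
)

# ASSUMPTION: the SYSCON clock counts seconds from this date.
ASSUMED_EPOCH = datetime(2000, 1, 1)


def decode_clock(word):
    """Turn the last column into a datetime, or None if it is FFFFFFFF."""
    if word.upper() == "FFFFFFFF":  # ASSUMPTION: all ones means "no clock value"
        return None
    return ASSUMED_EPOCH + timedelta(seconds=int(word, 16))


def parse_err_line(line):
    """Split one ERR line into index, step byte, error code and clock."""
    m = ERR_LINE.search(line)
    if m is None:
        return None
    index, _first, code, clock = m.groups()
    code = code.upper()
    return {
        "index": int(index),
        "step": code[2:4],   # ASSUMPTION: the byte after the A0 prefix is the step
        "error": code[4:],   # e.g. "1002"
        "when": decode_clock(clock),
    }


def tally(log_text):
    """Count how often each (step, error) pair appears in a pasted log."""
    entries = [parse_err_line(line) for line in log_text.splitlines()]
    return Counter((e["step"], e["error"]) for e in entries if e)


sample = """\
ERR 04: 00000000 A0093004 FFFFFFFF
ERR 05: 00000000 A0801001 0B48880E
ERR 10: 00000000 A0801002 0B49340E
ERR 19: 00000000 A0071002 29111D9B
"""
for (step, error), count in tally(sample).items():
    print(f"step {step} error {error}: {count}")
```

Under the assumed epoch, `0B48880E` would decode to 2005-12-31 00:06:38. If the epoch guess is wrong the dates shift, but the ordering of the entries stays the same.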
 
Hello, everyone!

Maybe someone will be able to advise on the next steps to diagnose my PS3 issue?
I have a CECHC PS3 (COK-002 board) that one day just YLODed and never turned on again. It now has an instant YLOD: after being powered on, it shuts down in less than a second.
I inspected the board and saw no immediate issues (other than 2 replaced RSX NEC/Tokin caps, but those were there before).

I got some logs from the SYSCON, mainly a 1004 error, which seems to be an AC/DC power failure, and no other clues. (The 3001 error fires when I start the console without the 12V rail connected, so I guess it can be ignored?)

Here are some logs:
> bringup
bringup
[SSM] state: 0000 -> 0101
Bringup Mode #0 (0xFF)
[SSM] ssmCb_OnStartingBePowOn() called.
[SSM] First Boot.
[SSM] Bringup mode : syspm_stat=00000000/00000000
[POWSEQ] PowerSeq_Setup called.
[SSM] fatalreq delayed.
[ERROR]: 0xa0081004
[SSM] state: 0101 -> 0201
[POWSEQ] AV Backend Setup
[SSM] *** Power Fail ***
[SSM] state: 0201 -> 0700
[POWSEQ] AV Backend Letup
[SSM] Shutdown mode : syspm_stat=00000000/00000000

> errlog
[POWSEQ] PowerSeq_Letup called.
[SSM] state: 0700 -> 0600
(PowerOff State) (Fatal)
errlog
ofst[ 96]:err_code:0xffffffff, clock:0xffffffff
ofst[100]:err_code:0xa0081004, clock:0xffffffff
ofst[104]:err_code:0xa0081004, clock:0xffffffff
ofst[108]:err_code:0xa0081004, clock:0xffffffff
ofst[112]:err_code:0xa0081004, clock:0xffffffff
ofst[116]:err_code:0xa0081004, clock:0xffffffff
ofst[120]:err_code:0xa0081004, clock:0xffffffff
ofst[124]:err_code:0xa0081004, clock:0xffffffff
ofst[ 0]:err_code:0xa0081004, clock:0xffffffff
ofst[ 4]:err_code:0xa0081004, clock:0xffffffff
ofst[ 8]:err_code:0xa0081004, clock:0xffffffff
ofst[ 12]:err_code:0xa0081004, clock:0xffffffff
ofst[ 16]:err_code:0xa0081004, clock:0xffffffff
ofst[ 20]:err_code:0xa0081004, clock:0xffffffff
ofst[ 24]:err_code:0xa0081004, clock:0xffffffff
ofst[ 28]:err_code:0xa0081004, clock:0xffffffff
ofst[ 32]:err_code:0xa0081004, clock:0xffffffff
ofst[ 36]:err_code:0xa0081004, clock:0xffffffff
ofst[ 40]:err_code:0xa0081004, clock:0xffffffff
ofst[ 44]:err_code:0xa0081004, clock:0xffffffff
ofst[ 48]:err_code:0xa0003001, clock:0xffffffff
ofst[ 52]:err_code:0xa0003001, clock:0xffffffff
ofst[ 56]:err_code:0xa0003001, clock:0xffffffff
ofst[ 60]:err_code:0xa0003001, clock:0xffffffff
ofst[ 64]:err_code:0xa0003001, clock:0xffffffff
ofst[ 68]:err_code:0xa0003001, clock:0xffffffff
ofst[ 72]:err_code:0xa0081004, clock:0xffffffff
ofst[ 76]:err_code:0xa0081004, clock:0xffffffff
ofst[ 80]:err_code:0xa0081004, clock:0xffffffff
ofst[ 84]:err_code:0xa0081004, clock:0xffffffff
ofst[ 88]:err_code:0xa0081004, clock:0xffffffff
ofst[ 92]:err_code:0xa0081004, clock:0xffffffff
[mullion]$
Not sure how to measure test points, because the console shuts off too quickly.
Any advice will be much appreciated.

Thanks
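
A small Python sketch for summarising a Mullion `errlog` dump like the one above. It relies only on the line layout visible in this post (`ofst[..]:err_code:0x..., clock:0x...`). Treating `0xffffffff` as an unused slot is an assumption, and the function name is made up.

```python
import re
from collections import Counter

# Matches lines such as "ofst[ 48]:err_code:0xa0003001, clock:0xffffffff".
ERRLOG_LINE = re.compile(
    r"ofst\[\s*(\d+)\]:err_code:0x([0-9a-fA-F]{8}),\s*clock:0x([0-9a-fA-F]{8})"
)


def summarise_errlog(text):
    """Return (count per error code, number of entries that carry a clock value)."""
    counts = Counter()
    with_clock = 0
    for _offset, code, clock in ERRLOG_LINE.findall(text):
        if code.lower() == "ffffffff":  # ASSUMPTION: all ones is an unused slot
            continue
        counts[code.lower()] += 1
        if clock.lower() != "ffffffff":
            with_clock += 1
    return counts, with_clock


sample = """\
ofst[ 96]:err_code:0xffffffff, clock:0xffffffff
ofst[100]:err_code:0xa0081004, clock:0xffffffff
ofst[ 44]:err_code:0xa0081004, clock:0xffffffff
ofst[ 48]:err_code:0xa0003001, clock:0xffffffff
"""
counts, with_clock = summarise_errlog(sample)
for code, n in counts.most_common():
    print(f"0x{code}: {n}")
print("entries with a clock value:", with_clock)
```

Run on the full dump above, this should give 25 entries of 0xa0081004 and 6 of 0xa0003001, none of them with a clock value.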
 
OK, so that means I'll have to poke the BGA solder points in the upper right corner with a needle probe to measure the voltage on a multimeter, or use an oscilloscope to record the noise. You just made me question why the hell I got into repairing things (just kidding). Anyway, I can't do either, as my multimeter has an LCD ghosting problem and I don't own any complex repair tools: no heat gun, oscilloscope or anything else. It's going to take a few months to get my hands on them. Got any cheap recommendations?
Once I purchase tools and make my own repair setup, I'll be open to suggestions and experimental requests on both hardware and software levels. Until then it will be just paperwork, comparing schematics and stuff.
 
On one of the PS3s, the 3 pads (Diag, Tx, Rx) were ripped off due to unfortunate mishandling :( and I would like to know of alternative points, if any, mostly for Rx, which I cannot trace.

Thanks
That depends on how far you are willing to go with the repair. If the upper traces are ripped off completely, with no trace left at all, then you'll possibly have to scratch the traces coming straight out of the syscon. Are you willing to go that far? If yes, you will need to post a photo of your syscon area with all the surrounding components clear in the image. I'll mark the traces for TX, RX and DIAG in my reply.
 
OK, so that means I'll have to poke the BGA solder points in the upper right corner with a needle probe to measure the voltage on a multimeter, or use an oscilloscope to record the noise. You just made me question why the hell I got into repairing things (just kidding). Anyway, I can't do either, as my multimeter has an LCD ghosting problem and I don't own any complex repair tools: no heat gun, oscilloscope or anything else. It's going to take a few months to get my hands on them. Got any cheap recommendations?
Once I purchase tools and make my own repair setup, I'll be open to suggestions and experimental requests on both hardware and software levels. Until then it will be just paperwork, comparing schematics and stuff.

Well, I hear that!

You and @Vendest are both having 1002 errors!? I have to say we've been seeing a lot of 1002 errors lately. Maybe it's just luck. I doubt it has to do with the prevalence of bad tokins, but I'm always willing to keep an open mind. Especially since they should become more prevalent as they age and die.

Anyway, because I am trying to pin down this error and characterize the faults that can cause it, I wanted to mention the oscilloscope, in case one of you had one (go for broke). I'd like to prove the ripple/noise was there before and not after. No worries if you don't have one! But you could still try soldering a 470uF TaPol cap to C2134 and C2135, the unpopulated pads in the upper right of that picture (FBVDDQ). They are the VRAM's bulk filter caps, and if there's ripple/noise causing the 1002 there, they should help. One or two of the low ESR/ESL caps you would be using to replace the tokins should do. It won't hurt anything, and if it works, then maybe SONY's note about VRAM Power is more helpful than previously thought.

I vaguely remember people trying before and it didn't work, but that was before the SYSCON error codes. We have no idea if they had 1002 errors or 3034 (probably the latter, which would explain why). I'm curious now.
Hi, these are the error logs pulled from 2 PS3s with YLOD, and I would like to seek some help with the diagnosis.

I did check against the Syscon error log codes and this forum. Is the CECHA01 a good candidate for NEC/Tokin replacement?

CECHA06, on which I replaced the NEC/Tokin (with the wrong tantalum cap, one with high ESR) and which has YLOD again after being revived for less than a day. I plan to install the tantalizer with the correct cap if the error is related.

Okay yes, both of those look to be prime candidates for bad tokins. An oscilloscope is needed to confirm bad tokins, but 1002 is known to be caused by bad tokins. It's just that we don't know if it is the only cause. Only one or two consoles with confirmed 1002 errors have been hooked up to an oscilloscope to confirm. And when I was probing PS3#7 (one of those consoles) I didn't think to check the VRAM filter (FBVDDQ). In hindsight that was a missed opportunity, but I'm learning as I go.

Anyway, that's why I'm curious to try populating C2134 and C2135 with 470uF TaPol caps (VRAM Power filter). If anyone has an oscilloscope and comes across a console with a 1002 error, would you please post the noise on FBVDDQ? I may revisit this, if I ever come across a 1002 console myself.

EDIT: Sorry, to answer your other question about alternative RX/TX SYSCON pads, yes there is an unpopulated connector (CN4009) that has pads (RX=Pad 10 and TX=11). Although I'm not sure if SONY sabotaged them or not. Companies tend to do that to connectors (leave a resistor unpopulated or something so you can't just install a connector and start debugging). You can also find southbridge RX/TX on that connector, if you're interested.
 
Got any cheap recommendations?

If you are interested in buying an oscilloscope I recommend the RIGOL DS1054Z. Best bang for the buck scope there is IMO! But we're still talking $350. There are cheap USB-to-PC and handheld oscilloscopes, but they are not good enough to use for this kind of thing. So they are a waste of money.

It is an expensive purchase that you will only feel is worthwhile if you intend on learning/repairing electronics. It's not going to be the type of investment you want to make for a one-off repair, just to get your PS3 running again.
 
Hi, I've learnt a lot from this thread. So I'd like to share a recent story of fixing a DIA-002.

So this board came very cheap as a donor board, and I didn't hold out much hope for it. When I did some basic probing for shorts I found the main 12V was shorted. Literally all the ceramic caps on the main power line read as shorted, and I was initially at a loss. I checked all the fuses and they were all good. I didn't want to do a syscon error code reading because that would damage the board even more (well, the 5V line wasn't shorted though).

So, without ever turning it on, I probed more around the RSX power line and found a power MOSFET, a Vishay Si7386DP (https://www.vishay.com/docs/73108/73108.pdf), that had all its pins shorted, including the gate pin. On all the other power MOSFETs most of the pins read as shorted because the whole 12V line was shorted, but their gate pins did not. This made me think this innocent-looking MOSFET was the culprit, so I decided to take it off.

Taking off a power MOSFET with a massive copper pad on its bottom didn't turn out to be easy; in fact, it was impossible for my little solder station. But just to prove my probing was right, I destroyed it with long-nose pliers. It turned into an ugly piece of copper sitting on the board, but then the probing showed no more short. All the initially suspect caps along the 12V line were no longer shorted.

Then, with some more confidence, I soldered on the syscon wires and tried to read some error codes. It turns out this board can turn on properly now!

The error codes were actually all 0xa0801001, except one 1200, which is because I didn't expect it to work on the first power-on and it had no heatsink or fan at all...

My understanding is that this power MOSFET was part of the circuit around the ISL6568 (https://www.renesas.com/us/en/document/dst/isl6568-datasheet) voltage regulation IC. The faulty one happens to be one of the phase power MOSFETs. Removing one seems good enough for just turning on the PS3. I can't do a decent graphics-heavy test until I jailbreak it and remarry a BD drive.

That's it. I'm quite happy with the knowledge I gained here and that, with some luck, I could pinpoint the faulty component and semi-fix this board.
I imagine that forcing one of the 2-phase buck converters to power the RSX will likely short it out if you push it hard. May not be a great idea, but I'd be curious to see the error code it generates (risky though, so it's up to you). Regardless, a hot air gun, flux and another buck controller should fix that up nicely.

This confirms the hypothesis that 1001 errors implicate the Buck controller(s). I'm betting these controllers are checked in the Power On Sequence at boot (or POST) and if there is a short it will not attempt to power up VDDC (either RSX or Cell). It's clear the SYSCON knows the fault is at the controller, not tokins. Otherwise we'd get a 1002 or 3004. I'm guessing the 3004 is reserved for cases where there is a complete VDDC failure, but where the buck controller is fine.

@sandungas @M4j0r @MrKnowItAll (seriously, that username hasn't been taken?)...I have been digging through the schematics and attempting to reverse engineer the Power On Sequence (POS). That's a fitting acronym! The double entendre encapsulates my frustration and has been uttered frequently throughout this process!

Case in point, there is a voltage feedback line from the VDDC output to the buck controller. Or rather, there would be if R6237 were populated. It isn't! That should cut the FB line. But why? I don't understand why they wouldn't want a direct feedback line to check that the VDDC output is being controlled properly. Or perhaps it's for phase locking (not sure if there is a PLL or not; I have been down that rabbit hole looking for a potential clue, but it may not be related to VDDC). I do see RSX_VDDMONI splitting off, however I can't find where it goes. It's one of those signals marked with a << (chevron) indicating it comes in from another part of the schematic. But I can't seem to find it for the life of me! I wish you could search these references to other parts of the schematic, like you can SMDs...POS! [Someone at SONY is laughing]

Anyway, This is as far as I have gotten to understanding this POS (double entendre intended):

From this video...
...I pulled this crapshow out of my A$$:

General Power On Sequence for NVIDIA based RSX
  1. All external main system voltages (12v, 5v, 3.3v, 1.7v etc.)
  2. RSX_VDDIO for HDMI and AV encoders. Perhaps VDDA is the source for these 1.5v? It would make sense given their placement near those VDDIO lines. Not sure.
  3. Then VDDC enable is formed to PWR on core voltage. IOR iP2003 Synchronous 2-Phase Buck Converters --> VDDC. If all goes well…
  4. ...VDDC_PWRGD is formed and acts as the enable signal for 1.8v FBVDDQ and PLL, which powers on GPU memory controller and VRAM. VDDR may also start in this step (Not sure) BD3520 N-Ch MOSFET Driver --> VDDR, YC_RC_VDDIO (FlexIO SPI).

Approximate Boot process (prior to FW 3.60) according to Rodrigo Copetti here:

I'm quoting most of this, with my own questions and notes inserted as indented bullets (originally in red). Please be kind, my attempts to understand all this are raw...
  1. Syscon powers on and executes instructions from its internal ROM. It then sends a 'Configuration Ring' to Cell via SPI which initializes Cell and deactivates the eighth SPU. Finally, it latches the power line and gives life to Cell.
    • Doesn't cell need power to be initialized and deactivate SPU 8? So how can this occur before PWR is latched? I assume he's talking about VDDC, main core voltage.
  2. Cell's PPU reset vector points to its hidden ROM, which stores the routines to locate and decrypt bootldr from Flash. The decrypted piece is then loaded by the first SPU in isolation mode.
  3. The now-isolated SPU, having loaded bootldr, initializes part of the hardware (XDR memory and I/O interfaces) and decrypts a binary named lv0 and instructs the PPU to run it.
    • So XDR_VDD and FlexIO (BE_VDDA & YC_RC_VDDIO?) are "initialized," whatever that means. Does it mean they are PWR'd on in this step or are initialized in software having already been pwr'd on in a previous step? The SYSCON's bringup log has the power sequence finish before the bootloader starts, so either the syscon's bringup log is referring to a different bootloader or the POS must have finished before the bootloader could load into the SPU. Or am I wrong that the POS is finished by the time the bootloader starts? This is confusing!
  4. The PPU, now executing lv0, decrypts metldr (a console-specific loader) and sends it to the third SPU, again in isolation mode.
  5. The SPU2, now executing metldr, executes five more loaders sequentially:
    1. lv1ldr decrypts and loads lv1, which contains the Hypervisor that takes over the first privilege level. Moreover, lv1 sets up the hard drive, Blu-ray drive and RSX.
      • Again, my understanding is that all of these peripherals should have been powered on already by now. So when he says initialized, he must mean in software?
    2. lv2ldr decrypts and loads lv2, which contains the kernel and runs on top of the hypervisor. It also finishes initialising RSX, the PS2 emulation, Bluetooth, USB controller and the Multi-card reader.
      • So at this point the RSX Power On Sequence should have completed. The question is at what point does it begin? Again, I think I'm not understanding what he means by "initialized." To me that means powered on, but that doesn't make sense in this context.
    3. appldr decrypts and loads vsh (the Visual Shell) and other dependencies. vsh will later enable the user to load a game.
    4. isoldr decrypts and loads modules that will run in the third SPU in isolation mode. These modules are critical for security and perform many cryptographic functions throughout the console's lifecycle. Consequently, the third SPU is reserved for security functions and games can't use it (leaving only six SPEs for games).
    5. The PPU, having loaded vsh, grants the user control through a graphical user interface, which manifests itself with an iconic orchestral splash sound followed by the XMB menu.

What I'm hoping to quantify is what's happening at each step of the POS:
Code:
[SSM] state: 0000 -> 0101
Bringup Mode #0 (0xFF)
[SSM] ssmCb_OnStartingBePowOn() called.
[SSM] First Boot.
[SSM] Bringup mode : syspm_stat=00000000/00000000
[POWSEQ] PowerSeq_Setup called.
[SSM] state: 0101 -> 0201
[POWSEQ] AV Backend Setup
[SSM] state: 0201 -> 0102
[SSM] state: 0102 -> 0202
[SSM] state: 0202 -> 0103
[SSM] state: 0103 -> 0203
[SSM] ssmCb_BeforeBeOn() called.
[SSM] state: 0203 -> 0104
Psbd_SbTransMode_Half:0x20e2
[SSM] state: 0104 -> 0204
[SSM] state: 0204 -> 0105
[SSM] state: 0105 -> 0400
(PowerOn State)
[SERV NVS] READ CMD
For example, does...
[SSM] state: 0000 -> 0101
...correspond to step one in Rodrigo Copetti's boot process?

Knowing what each step does will help figure out what went wrong when the bringup dialog shows an error during that step. I have been hoping we can figure them out for quite a while now, for this reason.
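
As a starting point for matching errors to steps, here is a rough Python sketch that pairs every `[ERROR]` line in a pasted bringup log with the last `[SSM] state` entered before it. It uses only the line formats seen in the logs in this thread. The function name is hypothetical, and whether the state printed just before an error is really the step that failed is exactly the open question, so treat the pairing as a hint rather than an answer.

```python
import re

STATE_LINE = re.compile(r"\[SSM\] state:\s*([0-9A-Fa-f]{4})\s*->\s*([0-9A-Fa-f]{4})")
ERROR_LINE = re.compile(r"\[ERROR\]:\s*(0x[0-9A-Fa-f]{8})")


def errors_with_state(log_text):
    """Return (error code, last state entered before it) for each [ERROR] line."""
    current = None  # no state transition seen yet
    found = []
    for line in log_text.splitlines():
        state = STATE_LINE.search(line)
        if state:
            current = state.group(2)  # remember the destination state
            continue
        error = ERROR_LINE.search(line)
        if error:
            found.append((error.group(1).lower(), current))
    return found


# Trimmed from the instant-YLOD bringup log posted earlier in the thread.
sample = """\
[SSM] state: 0000 -> 0101
[POWSEQ] PowerSeq_Setup called.
[SSM] fatalreq delayed.
[ERROR]: 0xa0081004
[SSM] state: 0101 -> 0201
[SSM] *** Power Fail ***
[SSM] state: 0201 -> 0700
"""
print(errors_with_state(sample))
```

For that log the error is printed after the console enters 0101 and before it moves on to 0201.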
 
[SSM] state: 0201 -> 0102
[SSM] state: 0102 -> 0202
[SSM] state: 0202 -> 0103
[SSM] state: 0103 -> 0203
[SSM] ssmCb_BeforeBeOn() called.
[SSM] state: 0203 -> 0104
Psbd_SbTransMode_Half:0x20e2
[SSM] state: 0104 -> 0204
[SSM] state: 0204 -> 0105
[SSM] state: 0105 -> 0400

I've spent a lot of time wondering about these, too... I used to think these steps in particular were sequential, but they're not! I mean, you go from 0202 to 0103. I wonder if the numbers correspond to different subsystems, and the numbers indicate their ID.

We might even have an idea of what they represent given the names of the two functions called here, ssmCb_OnStartingBePowOn() and ssmCb_BeforeBeOn(). Those two indicate that stuff is running before Cell is "on" (whatever that means), but goes along with the rest of the logical sequence Rodrigo Copetti details.

I'm assuming [ssm] stands for syscon, but do we know what "cb" is?
 
I've spent a lot of time wondering about these, too... I used to think these steps in particular were sequential, but they're not! I mean, you go from 0202 to 0103. I wonder if the numbers correspond to different subsystems, and the numbers indicate their ID.

They do tho.

Each one enumerates forwards, even though they jump back and forth between the 100 & 200 levels. Not sure what the levels mean, but I think of it like college classes: a 100 series (101, 102, 103, etc.) and a 200 series (201, 202, 203, etc.). It might indicate some kind of hierarchy. Or maybe there is meaning to the hundreds place, like 1xx is CPU and 2xx is RSX...IDK. I really am curious.
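
To check the "college classes" idea against a log, here is a short Python sketch. It groups the destination states by their first two digits and tests whether each group only counts upward. The split into a "series" and a "number" is just the guess described above, and the helper names are made up.

```python
import re
from collections import defaultdict

STATE_LINE = re.compile(r"\[SSM\] state:\s*([0-9A-Fa-f]{4})\s*->\s*([0-9A-Fa-f]{4})")


def series_from_log(log_text):
    """Group destination states by series (first two digits), keeping log order."""
    series = defaultdict(list)
    for _src, dst in STATE_LINE.findall(log_text):
        series[dst[:2]].append(int(dst[2:], 16))
    return dict(series)


def counts_upward(numbers):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(numbers, numbers[1:]))


# State lines from the successful bringup log quoted in this thread.
sample = """\
[SSM] state: 0000 -> 0101
[SSM] state: 0101 -> 0201
[SSM] state: 0201 -> 0102
[SSM] state: 0102 -> 0202
[SSM] state: 0202 -> 0103
[SSM] state: 0103 -> 0203
[SSM] state: 0203 -> 0104
[SSM] state: 0104 -> 0204
[SSM] state: 0204 -> 0105
[SSM] state: 0105 -> 0400
"""
for name, numbers in series_from_log(sample).items():
    print(name, numbers, counts_upward(numbers))
```

On that log the 01 series runs 1 to 5 and the 02 series runs 1 to 4, both strictly upward, with a single 04 state at the end.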

I have been wondering if there is a way to use my oscilloscope to record these events and correspond them to the time each voltage started. I guess the first step would be to probe each voltage and measure the time from boot until it is enabled. In that way piece together what voltages appear and in what order. Then hope it lines up with those events.
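
Once scope captures exist, ordering the rails is simple bookkeeping. A hedged Python sketch: it assumes each capture has been exported as (time, volts) samples sharing the same trigger, and it calls a rail "enabled" when it first reaches 90% of its nominal voltage. The threshold, the function names, the rail names and the sample numbers below are all made up for illustration. They are not measurements.

```python
def enable_time(samples, nominal, fraction=0.9):
    """First time at which the rail reaches fraction * nominal, or None."""
    threshold = fraction * nominal
    for t, volts in samples:
        if volts >= threshold:
            return t
    return None


def power_on_order(captures):
    """captures: {rail: (nominal_volts, [(t, volts), ...])} -> rails sorted by enable time."""
    times = {
        rail: enable_time(samples, nominal)
        for rail, (nominal, samples) in captures.items()
    }
    reached = {rail: t for rail, t in times.items() if t is not None}
    return sorted(reached, key=reached.get)


# Placeholder ramps only (times in ms); replace with real exported samples.
captures = {
    "rail_a": (1.8, [(0, 0.0), (5, 0.0), (10, 0.9), (15, 1.8)]),
    "rail_b": (1.2, [(0, 0.0), (5, 0.6), (10, 1.2), (15, 1.2)]),
}
print(power_on_order(captures))
```

The resulting order could then be laid next to the timestamps of the `[SSM] state` lines, if the serial output can be captured against the same trigger.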
 
Hello, everyone!

Maybe someone will be able to advise on the next steps to diagnose my PS3 issue?
I have a CECHC PS3 (COK-002 board) that one day just YLODed and never turned on again. It now has an instant YLOD: after being powered on, it shuts down in less than a second.
I inspected the board and saw no immediate issues (other than 2 replaced RSX NEC/Tokin caps, but those were there before).

I got some logs from the SYSCON, mainly a 1004 error, which seems to be an AC/DC power failure, and no other clues. (The 3001 error fires when I start the console without the 12V rail connected, so I guess it can be ignored?)

Here are some logs:
> bringup
bringup
[SSM] state: 0000 -> 0101
Bringup Mode #0 (0xFF)
[SSM] ssmCb_OnStartingBePowOn() called.
[SSM] First Boot.
[SSM] Bringup mode : syspm_stat=00000000/00000000
[POWSEQ] PowerSeq_Setup called.
[SSM] fatalreq delayed.
[ERROR]: 0xa0081004
[SSM] state: 0101 -> 0201
[POWSEQ] AV Backend Setup
[SSM] *** Power Fail ***
[SSM] state: 0201 -> 0700
[POWSEQ] AV Backend Letup
[SSM] Shutdown mode : syspm_stat=00000000/00000000

> errlog
[POWSEQ] PowerSeq_Letup called.
[SSM] state: 0700 -> 0600
(PowerOff State) (Fatal)
errlog
ofst[ 96]:err_code:0xffffffff, clock:0xffffffff
ofst[100]:err_code:0xa0081004, clock:0xffffffff
ofst[104]:err_code:0xa0081004, clock:0xffffffff
ofst[108]:err_code:0xa0081004, clock:0xffffffff
ofst[112]:err_code:0xa0081004, clock:0xffffffff
ofst[116]:err_code:0xa0081004, clock:0xffffffff
ofst[120]:err_code:0xa0081004, clock:0xffffffff
ofst[124]:err_code:0xa0081004, clock:0xffffffff
ofst[ 0]:err_code:0xa0081004, clock:0xffffffff
ofst[ 4]:err_code:0xa0081004, clock:0xffffffff
ofst[ 8]:err_code:0xa0081004, clock:0xffffffff
ofst[ 12]:err_code:0xa0081004, clock:0xffffffff
ofst[ 16]:err_code:0xa0081004, clock:0xffffffff
ofst[ 20]:err_code:0xa0081004, clock:0xffffffff
ofst[ 24]:err_code:0xa0081004, clock:0xffffffff
ofst[ 28]:err_code:0xa0081004, clock:0xffffffff
ofst[ 32]:err_code:0xa0081004, clock:0xffffffff
ofst[ 36]:err_code:0xa0081004, clock:0xffffffff
ofst[ 40]:err_code:0xa0081004, clock:0xffffffff
ofst[ 44]:err_code:0xa0081004, clock:0xffffffff
ofst[ 48]:err_code:0xa0003001, clock:0xffffffff
ofst[ 52]:err_code:0xa0003001, clock:0xffffffff
ofst[ 56]:err_code:0xa0003001, clock:0xffffffff
ofst[ 60]:err_code:0xa0003001, clock:0xffffffff
ofst[ 64]:err_code:0xa0003001, clock:0xffffffff
ofst[ 68]:err_code:0xa0003001, clock:0xffffffff
ofst[ 72]:err_code:0xa0081004, clock:0xffffffff
ofst[ 76]:err_code:0xa0081004, clock:0xffffffff
ofst[ 80]:err_code:0xa0081004, clock:0xffffffff
ofst[ 84]:err_code:0xa0081004, clock:0xffffffff
ofst[ 88]:err_code:0xa0081004, clock:0xffffffff
ofst[ 92]:err_code:0xa0081004, clock:0xffffffff
[mullion]$
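A long errlog dump like this is easier to read as a tally. The following is a rough Python sketch that counts how often each error code appears in pasted errlog output. Treating `0xffffffff` as an empty slot is an assumption on my part, and `tally_errlog` is just an illustrative helper name.

```python
import re
from collections import Counter

# Matches lines such as: ofst[ 96]:err_code:0xffffffff, clock:0xffffffff
ERRLOG_RE = re.compile(
    r"ofst\[\s*(\d+)\]:err_code:0x([0-9a-fA-F]{8}), clock:0x([0-9a-fA-F]{8})"
)


def tally_errlog(text):
    """Count occurrences of each error code in errlog output."""
    counts = Counter()
    for _offset, code, _clock in ERRLOG_RE.findall(text):
        code = code.lower()
        if code == "ffffffff":
            continue  # ASSUMPTION: all-ones means the slot was never written
        counts[code] += 1
    return counts


# A few lines taken from the dump above.
SAMPLE = """\
ofst[ 96]:err_code:0xffffffff, clock:0xffffffff
ofst[100]:err_code:0xa0081004, clock:0xffffffff
ofst[ 48]:err_code:0xa0003001, clock:0xffffffff
ofst[ 72]:err_code:0xa0081004, clock:0xffffffff
"""

for code, n in tally_errlog(SAMPLE).most_common():
    print(f"0x{code}: {n}")
```

Pasting the whole dump into `SAMPLE` gives the full picture of which code dominates the log.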
Not sure how to measure test points, because the console shuts off too quickly.
Any advice will be much appreciated.

Thanks
Do you have another power supply to test?

1004 has not been reported much. I had one once while testing, but it occurred while turning the console off (hard shutdown). We think they are mostly meaningless, but yours is occurring a lot, so I suspect the PSU.
 
@M4j0r explained this to me before in the frankenstein thread. I thought that VRAM was the voltage to the RSXRAM modules and that it is supplied by VDDR. But it's actually supplied by the 1.8V_FBVDDQ, and 1.2V_VDDR is supplying the voltage for SPI communication across the FlexIO. That makes sense.
I've also added the voltage descriptions here: https://www.psdevwiki.com/ps3/index.php?title=File:RSX_SKEMA.jpg&diff=prev&oldid=62463 .

Sony provides a bit more information about the PS3 system hardware in the "Sony BCU-100 Maintenance Manual". The BCU-100 is the Sony Zego Unit which uses the BE-28 board. The BE-28 board is based on the TMU-520 which is based on the COOKIE-02. So they're all related. You can find the relevant information on the following pages:
Code:
Page
19      Fuses (BE-28 only)
20-24   Component Descriptions/Voltages/Clocks (BE-28 only)
60-95   Component List (BE-28 only)
123     Overall Block Diagram
124     Clock Diagram
125     I²C Diagram
126     Reset Diagram
127     Power Supply Diagram
132-183 Schematics
214-215 Board Layout
216-219 Additional Passive Component List
@sandungas @M4j0r @MrKnowItAll (seriously, that username hasn't been taken?)...I have been digging through the schematics and attempting to reverse engineer the Power On Sequence (POS). That's a fitting acronym! The double entendre encapsulates my frustration and has been uttered frequently throughout this process!
The four digit SSM states aren't related to the Power Sequence Step.
This is a simplified overview of the Cytology boot sequence steps:
Code:
00-11 Power Sequence Init, Basic Voltage/Clock Init...
20-22 CELL Init
23    Check for RSX/SB/CP...
30-32 More CELL Init
40    Bit Training
50    RSX Init
51-52 SB Init
60-62 More CELL Init
FF    Finish
lv0ldr loading probably starts with step 60 and it gets executed after step FF (=80).
You can see that here: https://pastebin.com/BaCFugAU .
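For anyone who wants to look up a step number quickly, the overview above can be turned into a small Python table. This is only a sketch: reading the step numbers as hexadecimal is an assumption (the FF entry suggests it), the ranges come from the Cytology list and may not match retail boards exactly, and `describe_step` is a made-up helper name.

```python
# (first, last, description) copied from the simplified overview.
# ASSUMPTION: step numbers are hexadecimal and ranges are inclusive.
STEPS = [
    (0x00, 0x11, "Power Sequence Init, Basic Voltage/Clock Init..."),
    (0x20, 0x22, "CELL Init"),
    (0x23, 0x23, "Check for RSX/SB/CP..."),
    (0x30, 0x32, "More CELL Init"),
    (0x40, 0x40, "Bit Training"),
    (0x50, 0x50, "RSX Init"),
    (0x51, 0x52, "SB Init"),
    (0x60, 0x62, "More CELL Init"),
    (0xFF, 0xFF, "Finish"),
]


def describe_step(step):
    """Return the overview description for a step number, or 'unknown'."""
    for first, last, label in STEPS:
        if first <= step <= last:
            return label
    return "unknown"


for step in (0x08, 0x40, 0x70):
    print(f"step {step:02X}: {describe_step(step)}")
```

Steps that fall between the listed ranges come back as "unknown" rather than being guessed.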
 
Thanks for the reply! Unfortunately, I don't have a spare PSU, so I can't rule it out completely. I've checked its voltages though and everything seems to be normal: 5V and 12V are there, and the PSU turns on normally. It also looks OK on the inside - no leaked capacitors or anything.

I gave it another round today and checked the fuses - all seem to be ok. Also, the resistances between +/- NEC/TOKINs are 2.2 Ohms and 5.4 Ohms. From what I read here - also seems to be OK.

One thing on the board bothers me, though: the chip in the picture has a resistance of about 20-24 Ohms between pins 1 and 2. That seems quite low. Or is it within the acceptable range?

https://imgur.com/a/VybkGAZ
 
Hmm, interesting. Well, like I said there haven't been many reports of 1004 errors. So I, for one, am eager to learn what you find.

Tokin resistance is good as you have already confirmed (I can tell you've been researching. I commend you sir).

Pin 1 on IC2408 is Vout and pin 2 is NC/GND.
[Attached image: MM1561JFBE-SOP7B.png]
There should be separation between them! C2461, R2454, R2427 & R2458 separate them, and one of them could be shorted, but you can't always tell without a comparison. So take this with a grain of salt, but pins 1 and 2 are separated by 82 kOhms on a scrap COK-001 I just measured (it's missing the RSX). You may very well have correctly found an issue.

The 3.3v_Misc --> 1.8v_Analog converter should not have anything to do with the RSX, so I think the reading on my board is valid. And that makes me suspect yours is not correct. Now, applying what I hope I'm learning about the Power On sequence, one of the first steps should be to power on the main system voltages (such as this 3.3v --> 1.8v converter). The step number of 08 in 08 1004 is early, as I would expect based on the POS hypothesis that I'm currently working from. Another line of evidence supporting this is in your bringup log, where it says the error occurred in the "AV backend." So I think you are on the right track and am curious to see if you can figure this out. 1004 is an error we have not pinned down yet!
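To show how the "08 1004" reading falls out of the raw errlog value, here is a short Python sketch that splits a 32-bit code such as 0xa0081004 into its parts. The field layout (top byte as a prefix, next byte as the step number, low 16 bits as the error) is the working interpretation used in this thread, so treat it as an assumption; `decode_syscon_error` is a name invented for the example.

```python
def decode_syscon_error(code):
    """Split a 32-bit errlog value into (prefix, step, error).

    ASSUMPTION: layout is 0xPPSSEEEE (prefix, step number, error code).
    """
    prefix = (code >> 24) & 0xFF
    step = (code >> 16) & 0xFF
    error = code & 0xFFFF
    return prefix, step, error


for raw in (0xA0081004, 0xA0003001):
    prefix, step, error = decode_syscon_error(raw)
    print(f"0x{raw:08x}: prefix={prefix:02X} step={step:02X} error={error:04X}")
```

For 0xa0081004 this gives step 08 and error 1004, and for 0xa0003001 it gives step 00 and error 3001, matching how the codes are being discussed here.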
 
Can I see photos of the top of your board? I would like to mark up some of them and take some measurements. I may find possible defects, but I need a close look at each board on a per-case basis. Here we seem to have something related to a missing voltage near the RSX or SYSCON.
 
Thanks for your input! You may well be correct. Too bad I don't have another board to compare measurements against, and I haven't found any supplier to order a replacement chip from. So I guess I'll have to look for an inexpensive PS3 in working condition as a source of reliable measurements, and I will keep researching this board in the meantime.

Sure! Here are the photos of the board (top and bottom): Let me know if you need larger resolution images.
https://imgur.com/a/G8hQeEj


UPD: I noticed that my multimeter probes had gone bad (3-4 Ohm self-resistance), so I replaced them today and measured the NEC/Tokin resistance again. I got only 1.6-2 Ohms, and I'm not sure whether that is enough.
I also couldn't get the characteristics of the yellow tantalum caps, so they might have high ESR. I've ordered low-ESR replacements, but they'll take a while to arrive, and I'm not sure whether 1004 has anything to do with the NEC/Tokins. I'd guess they start affecting the boot process at later Power On stages.

Best regards,
Pavel
 
