🖥️ SAP Basis Monitoring · Step by Step · 2026–27

How SAP Systems Are Monitored
Step-by-Step Complete Guide for Every Consultant and Beginner

2026–27 Guide Pramod Behera 24 min read SAP Basis · Performance · Security · S/4HANA

🖥️ Why SAP System Monitoring Matters

An SAP system is not just a piece of software: it is the central nervous system of an entire organisation. When it works well, thousands of users can process orders, post invoices, run payroll, and manufacture products without disruption. When it develops a problem (a locked work process, a runaway database query, a failed background job, or a security breach), the entire business can grind to a halt within minutes. SAP system monitoring is the daily discipline of watching the system's health indicators, catching problems before they become outages, and optimising performance so every user gets the fastest possible experience. Whether you are a junior SAP Basis administrator just starting out or an experienced consultant preparing for an interview, understanding exactly how SAP systems are monitored (which tools to use, in which order, and what to look for) is one of the most essential practical skills in the entire SAP ecosystem. This guide breaks down the process step by step, using simple language and real-world examples to help you master the skill.

Performance Monitoring

Work processes, memory, response times, database calls- every millisecond matters for 5,000 concurrent SAP users.

🗄️

Database Health

SAP sits on HANA, Oracle, or SQL Server, and database performance directly determines overall system speed.

🔒

Security & Compliance

Failed logons, unauthorised access, and critical authorisation changes are all monitored via SM20 and SUIM.

System Performance
Work Process Monitoring
Database Health
ABAP Dumps
Background Jobs
Security Alerts
Batch Input & Queues
S/4HANA Monitoring

🌅 The SAP Basis Morning Health Check - Step by Step

Every SAP Basis administrator at a professional organisation follows a structured morning health check routine- a sequence of transactions executed in a specific order to confirm that the system survived the night successfully and is ready for the business day. This is not optional in enterprise SAP: a system that developed a problem at 2 AM needs to be caught at 7 AM before 3,000 users log in at 8 AM. The entire check takes 20–30 minutes when done efficiently.

Imagine the Basis team at Pramod Manufacturing Systems Ltd in Pune - a company with SAP S/4HANA running on HANA database, 4,200 active users across three plants. The Basis administrator, Pooja Patil, arrives at 6:00 AM every day and follows this exact sequence before the production shift begins.

MONITORING 01

Pramod Manufacturing Systems Ltd- Basis Morning Health Check

1
Check System Availability- SM51 (Application Server Overview)

The very first transaction Pooja opens every morning is SM51. This shows all application servers in the SAP landscape- their status (active/inactive), type (dialog, background, message, enqueue), and start time. If any server shows as inactive or recently restarted at an unexpected time, this is the first red flag of the day. In S/4HANA, the system manager checks the HANA cockpit for the database instance status simultaneously. At Pramod Manufacturing, SM51 shows 3 application servers- all green, all started on Sunday at 22:00 during the weekly restart window.

T-Code: SM51 · Shows: all application servers, their type and status · Also check: HANA Cockpit (S/4)
2
Check Work Processes- SM50 (Local) and SM66 (Global)

SM50 shows all work processes on the current application server: dialog (DIA), background (BGD), update (UPD, plus UP2 for V2 updates), enqueue (ENQ), and spool (SPO). The message server is a separate service rather than a work process type, so it does not appear here. The critical thing to look for is work processes stuck in Running status for a long time, especially any that show a suspiciously long elapsed time (more than 5 minutes for a dialog process is a problem). SM66 shows work processes across ALL application servers at once: the global view. At Pramod Manufacturing, Pooja spots one DIA work process on server APP02 showing "Running" for 47 minutes with program RSNAST00, a potential runaway process requiring investigation.

T-Code: SM50 (local WPs) · SM66 (global WPs) · Look for: Running > 5 min (DIA), stuck UPD processes
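The 5-minute rule for dialog work processes can be applied mechanically once the SM66 list has been exported. The sketch below is only an illustration: the row shape (`server`, `wp_type`, `status`, `elapsed_s`, `program`) is an assumed export format, not a set of SAP field names.

```python
from dataclasses import dataclass

# Hypothetical row shape for an exported SM66 list; field names are illustrative.
@dataclass
class WorkProcess:
    server: str
    wp_type: str      # "DIA", "BGD", "UPD", ...
    status: str       # "Running", "Waiting", ...
    elapsed_s: int    # seconds spent in the current status
    program: str

DIA_LIMIT_S = 5 * 60  # threshold from the text: dialog step running > 5 minutes

def long_running_dialog(wps):
    """Return dialog work processes that have been Running longer than the limit."""
    return [
        wp for wp in wps
        if wp.wp_type == "DIA" and wp.status == "Running" and wp.elapsed_s > DIA_LIMIT_S
    ]

snapshot = [
    WorkProcess("APP01", "DIA", "Waiting", 0, ""),
    WorkProcess("APP02", "DIA", "Running", 47 * 60, "RSNAST00"),
    WorkProcess("APP03", "BGD", "Running", 2 * 3600, "RMMRP000"),
]
flagged = long_running_dialog(snapshot)
for wp in flagged:
    print(f"{wp.server}: {wp.program} running for {wp.elapsed_s // 60} min")
```

Background processes are deliberately ignored here because long runtimes are normal for batch work; they are better judged against a per-job baseline.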
3
Check Active Users and Sessions- AL08 / SM04

AL08 shows a consolidated count of all logged-in users across all application servers globally. SM04 shows individual user sessions on the local server including their current transaction and how long they have been in it. Pooja checks AL08 at 7:05 AM- 47 users logged in (expected: some overnight batch operators and global users). If AL08 shows 3,000 users at 7 AM when the shift starts at 8 AM, something is wrong- perhaps the night-shift batch logon was never terminated, or a third-party interface is creating ghost sessions.

T-Code: AL08 (all servers, global user count) · SM04 (local sessions with transaction detail)
4
Check ABAP Runtime Errors (Dumps)- ST22

ST22 is the ABAP dump browser- it shows all short dumps (runtime errors) that occurred in the system. Every ABAP dump means a user's transaction terminated unexpectedly or a background job crashed with an error. Pooja filters ST22 for the last 24 hours and sees 3 dumps overnight: two TYPE_REF_DATA_TYPE errors in a custom programme and one DBIF_RSQL_SQL_ERROR in a standard MM job. The DBIF_RSQL_SQL_ERROR- a database error- is the most serious and is escalated to the DBA team immediately.

T-Code: ST22 · Filter: last 24h · Priority: DBIF (database errors) > MEMORY errors > program errors
5
Check Background Job Status- SM37

SM37 is the background job monitor. Every SAP system runs dozens to hundreds of background jobs overnight- payroll, MRP, period-end postings, interface programmes, data archiving. Pooja filters SM37 for jobs that ran overnight and looks specifically for any with status Cancelled or Aborted. At Pramod Manufacturing this morning: payroll job PC00_M40_CALC ran at 01:30- Status: FINISHED ✓. MRP job MD01N ran at 03:00- Status: CANCELLED ✗. This is a critical failure- MRP did not run, meaning today's production planning is based on yesterday's data. Pooja immediately restarts the MRP job and calls the PP team.

T-Code: SM37 · Filter: yesterday + today, all users, Cancelled/Aborted status · Action: investigate + restart
6
Check System Log- SM21

SM21 is the SAP system log- a chronological record of all system-level events: work process restarts, gateway errors, ICM (Internet Communication Manager) errors, enqueue server issues, and database connection failures. Unlike ST22 (which shows ABAP errors) and SM37 (which shows job failures), SM21 shows the infrastructure-level events. Pooja filters SM21 for the last 12 hours and finds 3 "Work process restarted" messages at 03:14 AM corresponding to the MRP job cancellation- confirming the job crashed due to a work process failure, not a programme error.

T-Code: SM21 · Filter: last 12h, Error/Warning messages · Look for: WP restarts, DB connection errors, ICM errors
7
Check Update Records- SM13

SM13 shows the status of all asynchronous update requests. When a user posts a document in SAP, the actual database write often happens asynchronously through the update work process. SM13 shows any update requests that are still in the queue or have failed. Failed updates (shown with an error status, whether the failing module is a V1 or a V2 update) mean data was entered by a user but never actually written to the database, which is a very serious data integrity issue. At Pramod Manufacturing, SM13 is clean: no pending or failed updates. Pooja marks this as green in the morning checklist.

T-Code: SM13 · Look for: update records with status ERR or INIT (failed/pending) · Action: investigate and re-execute
8
Check Spool and Print Queue- SP01 / SP12

SP01 shows all print spool requests. An accumulation of unprocessed or failed print requests can indicate a printer configuration problem. More critically, SP12 (TemSe administration) shows how much space the spool data occupies; if old spool requests are not regularly deleted, they can fill up the database. At Pramod Manufacturing, SP01 shows 1,200 spool requests from yesterday's payslip run, all printed successfully. SP12 shows spool at 45 GB of a 100 GB allocation, which is within acceptable limits. No action needed today.

T-Code: SP01 (spool requests) · SP12 (spool database size) · Alert if SP12 > 80% of allocation
Morning Health Check- Complete Checklist with T-Codes and What to Look For
# | Check | T-Code | Green (OK) | Red (Action Required)
1 | Application Servers | SM51 | All servers active, normal start time | Any server inactive or unexpected restart time
2 | Work Processes (local) | SM50 | All WPs in Wait or running normally | DIA WP running > 5 min, stuck UPD processes
3 | Work Processes (global) | SM66 | Consistent across all app servers | Server with all DIA WPs occupied = bottleneck
4 | Active User Count | AL08 | Expected count for time of day | Ghost sessions, unexpectedly high count
5 | ABAP Dumps | ST22 | Zero or only known/expected dumps | DBIF errors, recurring dumps same programme
6 | Background Jobs | SM37 | All critical jobs FINISHED | Any critical job CANCELLED or ABORTED
7 | System Log | SM21 | No errors or warnings | WP restarts, DB failures, ICM errors
8 | Update Records | SM13 | No failed or pending updates | Any ERR or INIT status records
9 | Spool Status | SP01 / SP12 | Jobs printed, spool < 80% full | Failed prints, spool database nearly full
10 | Lock Entries | SM12 | Normal lock count, no stale locks | Old lock entries (BEGDA yesterday) = system crash residue
💡

Real Basis Interview Tip: When asked "how do you start your day as a Basis admin?", always mention SM51 → SM66 → ST22 → SM37 → SM21 in that order. This sequence- servers → work processes → dumps → jobs → system log- shows you understand the priority hierarchy: infrastructure first, then application layer, then batch jobs. Interviewers value this structured thinking over random T-code recitation.

⚡ SAP Performance Monitoring- Finding the Bottleneck

Performance monitoring in SAP is a systematic process of identifying where the system is slow- whether the problem lies in the network, the SAP application layer, the database, or in inefficient ABAP code. SAP provides a set of powerful workload analysis tools that capture response time statistics for every transaction executed. The key principle of SAP performance monitoring is: always start at the highest level and drill down- from overall system workload statistics to individual programme analysis to specific database SQL statements.

At Pramod Global Retail Ltd in Mumbai, the production SAP system starts slowing down every day between 10:30 AM and 12:00 PM. Users report that transaction VA01 (Sales Order creation) is taking 8–12 seconds per save instead of the normal 1–2 seconds. The Basis and ABAP teams must identify the root cause using the performance monitoring toolchain.

MONITORING 02

Pramod Global Retail Ltd- Transaction VA01 Slowness Investigation

1
Overall Workload Statistics- SM66 / AL08

The Basis admin opens SM66 at 10:45 AM and sees that all 60 dialog work processes on the three application servers are occupied: 100% utilisation. This confirms the system is work-process saturated during this window. AL08 shows 4,800 active users (unusually high; the normal peak is 3,500). The combination of 100% WP utilisation and an unexpectedly high user count narrows the cause to either a sudden spike in users or individual sessions holding work processes for too long.

T-Code: SM66 (WP utilisation) · AL08 (user count) · Immediate read: is the system saturated at infrastructure level?
2
Workload Analysis- ST03N (Workload Monitor)

The Basis admin opens the Workload Monitor (transaction ST03N) and navigates to the Transaction Profile for the last 2 hours. This shows average response times by transaction. VA01 shows an average response time of 9,840 ms, split into application server time of 340 ms (normal) and database time of 9,500 ms (extremely abnormal; it should be < 500 ms). The database is consuming about 97% of VA01's response time. This is the key finding: a database bottleneck, not an application server one. The investigation now shifts to the database layer.

T-Code: ST03N (Workload Monitor) → Transaction Profile · Read: DB time vs App time split in response breakdown
3
SQL Trace to Find the Slow Query- ST05

The ABAP developer activates ST05 (SQL Trace) and reproduces the slow VA01 save. ST05 captures every database call made during the trace session with execution times. The trace results show one SELECT statement consuming 8,900 ms- a SELECT on table VBAP with no index hit (Full Table Scan). The SQL: SELECT * FROM VBAP WHERE KUNNR = 'C-10001' AND ERDAT = '20250415'. Table VBAP has 48 million rows- a full scan of this table for every sales order save is the root cause of the 8-second slowdown.

T-Code: ST05 (SQL Trace) · Activate → reproduce → deactivate → analyse · Look for: Full Table Scan, high runtime SELECTs
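Once a trace has been exported, ranking statements by total time makes the expensive one stand out quickly. The following is a rough sketch that assumes the trace is available as hypothetical `(statement, duration_ms)` pairs; ST05 has its own summarised views, so this only illustrates the idea.

```python
from collections import defaultdict

def top_statements(trace_rows, limit=3):
    """Sum duration per statement text and return the most expensive ones first."""
    totals = defaultdict(lambda: [0, 0.0])  # statement -> [executions, total_ms]
    for statement, duration_ms in trace_rows:
        totals[statement][0] += 1
        totals[statement][1] += duration_ms
    ranked = sorted(totals.items(), key=lambda kv: kv[1][1], reverse=True)
    return [(stmt, count, total) for stmt, (count, total) in ranked[:limit]]

# Illustrative trace rows; only the VBAP statement and its 8,900 ms come from the text.
trace = [
    ("SELECT * FROM VBAP WHERE KUNNR = ? AND ERDAT = ?", 8900.0),
    ("SELECT SINGLE * FROM KNA1 WHERE KUNNR = ?", 2.1),
    ("SELECT * FROM VBAK WHERE VBELN = ?", 1.4),
    ("SELECT SINGLE * FROM KNA1 WHERE KUNNR = ?", 1.9),
]
worst = top_statements(trace, limit=1)[0]
print(worst)
```

Grouping by statement text rather than by single execution matters in practice: a cheap statement executed thousands of times can cost more than one slow scan.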
4
Identify the Missing Index- SE11 / DB02

The ABAP developer opens SE11 for table VBAP and checks its indexes. Standard SAP index VBAP~0 covers MANDT + VBELN + POSNR (primary key). Custom index VBAP~Z01 covers MATNR + WERKS (for material-plant queries). But there is no index on KUNNR (customer), the field being used in the problematic SELECT. DB02 (Database Performance Monitor) confirms the absence of an index on KUNNR and shows this query has been running 12,000 times today with zero cache hits. The fix: create a secondary index on VBAP.KUNNR + VBAP.ERDAT via SE11 and activate it, or review the custom ABAP code to change the WHERE clause to use an indexed field.

T-Code: SE11 (table indexes) · DB02 (database performance) · Fix: secondary index or ABAP WHERE clause optimisation
5
Memory and Buffer Analysis- ST02

ST02 (Tune Summary) shows the memory and buffer utilisation of the SAP application server: program buffer, table buffer, CUA buffer, field description buffer. A Swap (buffer overflow) count greater than 0, and especially one that keeps growing, indicates the buffer is too small and data is being reloaded from the database repeatedly. At Pramod Global Retail, ST02 shows the Nametab buffer (which stores table structure definitions) has Swap count = 14,500, meaning table structure lookups went to the database 14,500 times instead of being served from memory. Increasing the nametab buffer size via the rsdb/ntab/* profile parameters (for example rsdb/ntab/entrycount) is recommended.

T-Code: ST02 (Tune Summary) · Look for: Swap > 0 in any buffer row = buffer too small · Fix: profile parameter increase
6
HANA Database Performance- HANA Studio / HANA Cockpit

For S/4HANA on HANA database, performance investigation goes deeper into the HANA cockpit. The HANA Performance Analysis view shows: CPU usage per HANA service, memory consumption, thread states (Running / Waiting / Suspended), and the Expensive Statements Trace which captures the top 100 slowest SQL statements. At Pramod Global Retail, the HANA cockpit confirms the VBAP full table scan- it appears as the #1 most expensive statement consuming 34% of total HANA CPU. Analysing the EXPLAIN PLAN in HANA studio confirms no index is used- consistent with the ST05 finding from the application layer.

HANA Cockpit → Performance → Expensive Statements Trace · Also: HANA Studio → SQL Editor → EXPLAIN PLAN

SAP Performance Investigation Golden Rule: Always decompose response time using this hierarchy- Total = DB time + App time + Network time. If DB time dominates: look at ST05 (missing index, full table scan). If App time dominates: look at ST12 (ABAP trace- program logic inefficiency). If Network time dominates: look at SMICM (ICM configuration, network topology). Every performance investigation follows this decomposition- never guess where the bottleneck is before measuring it.
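The decomposition rule lends itself to a small helper. This is a minimal sketch using the VA01 figures from the investigation above; the function name and the tool mapping are illustrative only.

```python
def dominant_component(db_ms, app_ms, network_ms=0.0):
    """Return the largest response-time component and its share of the total."""
    parts = {"database": db_ms, "application": app_ms, "network": network_ms}
    total = sum(parts.values())
    if total <= 0:
        raise ValueError("total response time must be positive")
    name = max(parts, key=parts.get)
    return name, parts[name] / total

# Next tool to open for each dominant component, as listed in the golden rule.
NEXT_TOOL = {"database": "ST05", "application": "ST12", "network": "SMICM"}

name, share = dominant_component(db_ms=9500, app_ms=340)
print(f"{name} time is {share:.1%} of the total; next tool: {NEXT_TOOL[name]}")
```

Measuring the split first keeps the investigation from starting in the wrong layer.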

🔒 SAP Security Monitoring- Detecting Threats in Real Time

SAP security monitoring is the continuous surveillance of user activity, authorisation violations, and configuration changes in the SAP system. Because SAP contains the most sensitive business data in any organisation (financial records, payroll, pricing, strategic plans), a security breach in SAP can be catastrophic. SAP provides the Security Audit Log (SM20) as its primary security monitoring tool, supplemented by user information reports in SUIM and trace analysis in ST01. Timely monitoring of Security Audit Log events is commonly expected by auditors assessing compliance with SOX, GDPR, and ISO 27001.

At Pramod Financial Services Pvt Ltd in Chennai, the SAP security team conducts a weekly security review using the SAP Security Audit Log and related tools. This week's review has flagged several anomalies requiring investigation.

MONITORING 03

Pramod Financial Services Pvt Ltd- Weekly SAP Security Review

1
Security Audit Log- SM20 (Failed Logon Monitoring)

The security admin opens SM20 and filters the past 7 days for failed logon events (audit message AU2). Result: user VENDOR.PORTAL has 847 failed logon attempts over 3 days, a classic brute-force attack pattern. The IP address making all attempts is 103.45.xx.xx, an external IP not in the company's IP whitelist. The account is immediately locked using SU01 and the IT security team is notified. The failed attempts remain in the audit log files as evidence: ordinary users cannot alter or delete them, and removal is restricted to administrators with audit log reorganisation rights.

T-Code: SM20 → filter: failed logon events (AU2) · Look for: repeated failures from same IP, off-hours logons
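Spotting the brute-force pattern comes down to counting failures per user and source address. Here is a minimal sketch, assuming audit events have already been exported as dictionaries; the keys, the event labels, and the 203.0.113.7 address (a documentation-range placeholder, since the real IP is masked in the text) are all hypothetical.

```python
from collections import Counter

def brute_force_suspects(events, threshold=5):
    """Count failed logons per (user, source IP) and keep pairs at or above the threshold."""
    counts = Counter(
        (e["user"], e["ip"]) for e in events if e["event"] == "LOGON_FAILED"
    )
    return {key: n for key, n in counts.items() if n >= threshold}

# Illustrative export: one account under attack, one user who mistyped a password twice.
events = (
    [{"user": "VENDOR.PORTAL", "ip": "203.0.113.7", "event": "LOGON_FAILED"}] * 847
    + [{"user": "POOJA.P", "ip": "10.0.0.15", "event": "LOGON_FAILED"}] * 2
    + [{"user": "POOJA.P", "ip": "10.0.0.15", "event": "LOGON_OK"}]
)
suspects = brute_force_suspects(events)
print(suspects)
```

Keying on the (user, IP) pair rather than the user alone helps separate one hostile source from many employees mistyping passwords after a policy change.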
2
Check Users with Critical Authorisations- SUIM

SUIM (User Information System) is the SAP authorisation reporting tool. The security team runs the report "Users with Critical Combinations of Authorisations"- the classic Segregation of Duties check. At Pramod Financial Services: 3 users have both F_BKPF_BUK (post FI documents) AND F_BKPF_KOA (post to vendor accounts) in the same profile- meaning they can create and post vendor invoices without approval. This SoD conflict should not exist. Two of the three users are AP clerks who should not have posting authority- their profiles need correction via role redesign.

T-Code: SUIM → Users → By Authorization Values · Critical combos: F_BKPF_BUK + F_BKPF_KOA, S_TCODE + SE38 (execute any ABAP)
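A Segregation of Duties check is essentially a set-containment test. The sketch below assumes a hypothetical export that maps each user to the authorisation objects they hold; the user IDs are invented for illustration.

```python
# Critical combination from the text; extend the list for your own SoD matrix.
CRITICAL_COMBOS = [frozenset({"F_BKPF_BUK", "F_BKPF_KOA"})]

def sod_conflicts(user_auths, combos=CRITICAL_COMBOS):
    """Map each user to the critical combinations fully contained in their authorisations."""
    conflicts = {}
    for user, auths in user_auths.items():
        held = set(auths)
        hits = [tuple(sorted(combo)) for combo in combos if combo <= held]
        if hits:
            conflicts[user] = hits
    return conflicts

# Hypothetical export: user -> authorisation objects assigned through roles.
user_auths = {
    "AP.CLERK01": ["F_BKPF_BUK", "F_BKPF_KOA", "F_LFA1_BUK"],
    "AP.CLERK02": ["F_BKPF_KOA"],
    "GL.ACCT01": ["F_BKPF_BUK"],
}
conflicts = sod_conflicts(user_auths)
print(conflicts)
```

A real SoD analysis also has to compare field values within each object (company code, account type, activity), which this object-level sketch ignores.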
3
Monitor Users with SAP_ALL or Debugging Authority- SUIM

SAP_ALL is the most powerful profile in SAP: a user with SAP_ALL can do anything in the system, including deleting data, changing financial postings, and bypassing all controls. SUIM → Users by Profile → SAP_ALL should show only the SAP DDIC and emergency break-glass accounts, never regular business users. At Pramod Financial Services, SUIM reveals 4 users with SAP_ALL: 2 are correct (DDIC, emergency), 1 is an ABAP developer who was given SAP_ALL "temporarily" 18 months ago, and 1 is a recently resigned employee whose account was not locked. The last two are corrected immediately: SAP_ALL is removed from the developer and the leaver's account is locked.

T-Code: SUIM → Users by Profile (SAP_ALL) · Rule: SAP_ALL only for DDIC + maximum 1 emergency break-glass user
4
Check Inactive / Locked Users- SU10 / SUIM

Dormant accounts- active SAP user IDs that have not logged in for 90+ days- are a major security risk. A hacker who discovers these credentials can use them without the original user knowing. SUIM → Users who have not logged on since shows all such dormant accounts. At Pramod Financial Services: 23 users have not logged in for more than 90 days. Cross-checking with HR PA0001 (infotype 0001 ENDDA < today) reveals 8 of these are ex-employees- accounts were never locked after resignation. All 8 are locked immediately via SU10 (mass user management).

T-Code: SUIM → Inactive Users / SU10 (mass lock) · Policy: auto-lock after 60 days inactivity + daily HR-SAP sync
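The dormant-account review is a join between last-logon dates and HR end dates. Below is a minimal sketch with invented user IDs and dates; both input dictionaries stand in for exports from SUIM and HR and are assumptions.

```python
from datetime import date, timedelta

def dormant_users(last_logon, today, max_idle_days=90):
    """Users whose last logon is older than the idle limit (None = never logged on)."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(u for u, d in last_logon.items() if d is None or d < cutoff)

def leavers_to_lock(dormant, hr_end_dates, today):
    """Dormant users whose HR end date (ENDDA in the text) lies in the past."""
    return [u for u in dormant if u in hr_end_dates and hr_end_dates[u] < today]

today = date(2026, 4, 15)
last_logon = {
    "RAVI.K": date(2025, 11, 2),
    "MEENA.S": date(2026, 4, 10),
    "ARUN.T": date(2025, 12, 20),
}
hr_end_dates = {"RAVI.K": date(2025, 11, 30)}  # only leavers carry a past end date

dormant = dormant_users(last_logon, today)
to_lock = leavers_to_lock(dormant, hr_end_dates, today)
print(dormant, to_lock)
```

Dormant users who are still employed (ARUN.T here) are left for manual review rather than locked automatically, since long leave is a legitimate reason for inactivity.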
5
Monitor Sensitive Transaction Usage- SM20 / STAD

Certain SAP transactions are extremely sensitive and should only be used by very specific people: SE38/SE80 (ABAP editor- can change programmes), SM49/SM69 (execute OS commands), STRUST (SSL certificates), SE16N with edit capability (change any table directly). SM20 can be configured to log every execution of these transactions. STAD (Statistical Records) also records every transaction executed by every user. At Pramod Financial Services, SM20 shows user ABAP.DEV003 executed SE16N with edit mode on table BSEG (FI line items)- a direct table edit on financial data. Immediate security escalation.

T-Code: SM20 (sensitive T-code execution) · STAD (statistical records per user) · RSUSR200 (user activity report)

Key Security Monitoring Principle: SAP security monitoring must be continuous, not periodic. Companies that review SM20 only during audits will miss breaches for weeks. Best practice: forward Security Audit Log events to a SIEM system (Security Information and Event Management) in real time, with automated alerts for: (1) 5+ failed logons from same user, (2) any logon at 2–5 AM, (3) any use of SE38/SE16N edit mode in production, (4) any new user added to SAP_ALL. These four automated alerts cover a large share of common SAP security incidents.

🗄️ SAP Database Monitoring- Keeping the Engine Running

The SAP database is the heart of the entire system- every business transaction ultimately writes to and reads from the database. Database monitoring in SAP covers three areas: space management (is the database running out of room?), performance (are queries running efficiently?), and backup and recovery (can we restore the system if something goes wrong?). The tools differ slightly between HANA (S/4HANA) and traditional databases (Oracle, SQL Server, DB2), but the monitoring philosophy is identical.

MONITORING 04

Pramod Steel Industries Ltd- Database Space and Backup Monitoring

1
Database Size and Space- DB02

DB02 (Database Performance Monitor) is the starting point for all database space monitoring. It shows: total database size, used space, free space, largest tables, tablespace utilisation (for Oracle), and space growth trend. At Pramod Steel, DB02 shows the database is 2.8 TB of a 3.5 TB allocation- 80% full. The growth trend shows 12 GB per day. This means the database will be full in approximately 58 days. Pooja raises an urgent ticket with the infrastructure team for storage expansion and simultaneously reviews the data archiving backlog- large tables like MSEG (goods movements) and BSEG (FI line items) are the top consumers.

T-Code: DB02 (space overview) · Alert threshold: 80% used = amber, 90% used = red · Also: TAANA for table analysis
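The "days until full" figure and the amber/red thresholds are simple arithmetic. This sketch reuses the Pramod Steel numbers and assumes decimal units (1 TB = 1,000 GB) and a constant growth rate.

```python
def days_until_full(allocated_gb, used_gb, growth_gb_per_day):
    """Days of headroom left at a constant daily growth rate."""
    if growth_gb_per_day <= 0:
        raise ValueError("growth rate must be positive")
    return (allocated_gb - used_gb) / growth_gb_per_day

def space_status(used_gb, allocated_gb):
    """Apply the thresholds from the text: 80% used = amber, 90% used = red."""
    # Integer comparison avoids floating-point surprises exactly at a threshold.
    if used_gb * 100 >= 90 * allocated_gb:
        return "red"
    if used_gb * 100 >= 80 * allocated_gb:
        return "amber"
    return "green"

days = days_until_full(allocated_gb=3500, used_gb=2800, growth_gb_per_day=12)
print(f"{days:.0f} days of headroom, status {space_status(2800, 3500)}")
```

Because archiving runs and month-end peaks make real growth uneven, the result is best read as an order of magnitude rather than a deadline.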
2
Check Database Backup Status- DB13

DB13 is the DBA Planning Calendar: it shows the schedule and status of all database backup jobs. At Pramod Steel, the backup schedule is: Full backup every Sunday at 22:00, Incremental backup daily at 01:00, Archive log backup every 30 minutes. DB13 shows this Tuesday's incremental backup at 01:00 has status FAILED because the backup storage device was full. This is a critical finding: if the database crashed right now, recovery would have to start from Monday's 01:00 incremental and roll forward through the archive log backups, and with the backup device full those log backups may be missing too, so everything posted since the last usable backup could be permanently lost. Immediate action: clear backup storage, then rerun Tuesday's backup manually.

T-Code: DB13 (DBA Planning Calendar) · Check: every morning- last backup status MUST be green · Escalate immediately if FAILED
3
HANA Memory Monitoring- HANA Cockpit

For S/4HANA on HANA, the most critical resource is memory- HANA is an in-memory database and requires all active data to fit in RAM. The HANA Cockpit shows: total memory installed, memory currently used by column store (main data), memory used by row store (system tables), and delta merge status. At Pramod Steel's S/4HANA landscape, HANA memory shows 1.2 TB used of 1.5 TB installed- 80% consumed. Memory alert is triggered and the team initiates a HANA unload of cold data (data not accessed in 6 months) to disk using HANA Native Storage Extension (NSE).

HANA Cockpit → Memory → Column Store · Alert: > 85% used = urgent · Action: NSE tier migration, unload cold data
4
Database Statistics and Index Maintenance- DB20 / DB05 / HANA

Outdated optimiser statistics cause the database query optimiser to make bad decisions about query execution plans, leading to unexpected full table scans. DB20 displays and refreshes the statistics for a table, while DB05 analyses how selective a table's indexed fields are. For HANA, statistics are maintained automatically. For Oracle and SQL Server, periodic statistics updates are critical and are normally scheduled through the DB13 planning calendar. At Pramod Steel, DB20 shows table LIPS (delivery items) has statistics that are 45 days old, and this table grew from 12 million to 28 million rows in that period. Refreshing the statistics corrects the optimiser's row estimates and improves delivery reporting query performance by 60%.

T-Code: DB20 (check and update table statistics) · DB05 (table analysis by indexed fields) · For HANA: automatic but verify in cockpit

🚛 Transport System Monitoring- Tracking Every Change to Production

The SAP Transport Management System (TMS) controls how configuration changes and development are moved from Development → Quality → Production. Every change that reaches production must go through a transport request- this is both a change control mechanism and an audit trail. Monitoring the transport system means ensuring that only approved changes reach production, that no transport has failed or been skipped, and that the production system reflects the exact set of approved configurations.

MONITORING 05

Pramod Technology Solutions Ltd- Transport Queue and Change Control Monitoring

1
Check Transport Queue- STMS

STMS (Transport Management System) shows the transport queue for each system in the landscape. Pooja opens STMS → Import Queue for the Production system (PRD). It shows 14 transport requests waiting for import. Of these, 12 are approved (green checkmark from the Change Advisory Board in the ServiceNow ticketing system), 1 is pending approval (import blocked), and 1 has an error status from the last import attempt (the import log shows the transport failed due to a syntax error in a custom ABAP programme). The failed transport must be fixed in Development and re-exported before it can be re-imported to Production.

T-Code: STMS → Import Queue (PRD) · Check: status of all queued transports · Never import manually without CAB approval
2
Review Transport Logs- SE10 / STMS

SE10 shows all transport requests in the current system with their status. For each transport, clicking on the log icon shows a detailed log of every object included in the transport (programmes, table entries, customising settings) and whether each was imported successfully. The team reviews the transport that failed on PRD import: the log shows programme ZORDER_PRICE_CALC had a syntax error in the target system (PRD uses ECC 6.04 SP level while DEV was recently upgraded to SP05- a classic system level mismatch causing transport failures).

T-Code: SE10 (transport request list) · STMS (transport log viewer) · Check: every transport log post-import
3
Monitor Unauthorised Direct Changes in Production- SE06 / RDDPRCHK

One of the most serious control violations in SAP is a developer making a direct change in the Production system (bypassing the transport system). This can happen if a user has inappropriate development authorisation in PRD. The report RDDPRCHK (or via SE06 → Check) identifies any objects in PRD that were changed directly rather than through a transport. At Pramod Technology, the check reveals that user ABAP.LEAD directly modified table TVARVC (selection variables) in PRD without a transport- a common workaround for date parameter changes that is technically a control bypass and must be documented and reported.

T-Code: SE06 → Repair Flag Check · Report: RDDPRCHK · Policy: zero direct changes in PRD- all changes via transport only

⏰ Background Job Monitoring- The Heartbeat of SAP Operations

Background jobs are the automated engine of SAP- they run without user interaction to process payroll, close fiscal periods, run MRP, transfer data to external systems, and generate reports. A failed background job is often more serious than a system performance problem because it means a business process has silently not completed. Unlike user-interactive transactions where a user sees the error immediately, a background job failure may go unnoticed for hours or days if monitoring is not in place.

MONITORING 06

Pramod Agro Industries Ltd- Critical Background Job Monitoring

1
Daily Job Status Review- SM37

SM37 with selection: all users, all jobs, date = today, status = Cancelled or Aborted. At Pramod Agro, today's review shows 3 cancelled jobs: (1) ZSALES_REPORT_DAILY- a custom daily sales report, low priority. (2) RMMRP000 (MRP run)- critical, already restarted. (3) SAP_COLLECTOR_FOR_PERFMONITOR- system performance data collection job, medium priority. For each cancelled job, Pooja clicks on the job name → Job Log to see the exact error message. The job log is the most valuable diagnostic tool- it shows the exact ABAP error or system message that caused the cancellation.

T-Code: SM37 → filter Cancelled/Aborted · Click job name → Job Log for exact error · Restart critical jobs immediately
2
Check Jobs Running Longer Than Expected- SM37 / SM66

A job that is still Running when it should have completed 2 hours ago is as serious as a cancelled job. SM37 filtered for status = Active shows any currently running background jobs. Cross-referencing with SM66 shows the work processes they are using. At Pramod Agro, the month-end settlement job KO8G has been running for 6 hours (normal runtime: 90 minutes). SM66 shows the corresponding BGD work process is using 95% CPU. This is a runaway batch job- likely caused by a data volume issue or inefficient query introduced by the last transport. The Basis team decides to let it run for 2 more hours before taking the drastic step of cancelling and investigating the root cause.

T-Code: SM37 (Active filter) + SM66 cross-reference · Alert: job running > 150% of normal runtime = investigate
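The 150% rule needs a baseline runtime per job. Here is a small sketch, assuming current and normal runtimes in minutes are held in two hypothetical dictionaries keyed by job name.

```python
def overrunning_jobs(active_runtime_min, normal_runtime_min, factor=1.5):
    """Return (job, ratio) for active jobs running longer than factor x their normal runtime."""
    flagged = []
    for job, runtime in active_runtime_min.items():
        normal = normal_runtime_min.get(job)
        if normal and runtime > factor * normal:
            flagged.append((job, runtime / normal))
    return flagged

# KO8G figures come from the text; the second job is an illustrative control case.
active = {"KO8G": 360, "ZSALES_REPORT_DAILY": 12}
normal = {"KO8G": 90, "ZSALES_REPORT_DAILY": 10}

flagged = overrunning_jobs(active, normal)
for job, ratio in flagged:
    print(f"{job} is at {ratio:.0%} of its normal runtime")
```

Jobs without a known baseline are skipped rather than flagged, so new jobs need a baseline entry before the check can cover them.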
3
Monitor Job Schedule Integrity- SM36 / SM37

It is not enough to check if jobs completed: you must also verify the job schedule itself is intact. Jobs can lose their schedule if the SAP system is restarted without the proper shutdown procedure. SM36 (Define Background Job) is where schedules are created, and SM37 filtered on status Scheduled or Released with a future date range lists the periodic jobs and their next planned start. RSBTCDEL2 should not be used for this purpose, because it is the clean-up report that deletes old jobs. At Pramod Agro, the payment run F110 that should execute every Tuesday at 14:00 shows no next run scheduled, meaning it was lost during the emergency system restart last Friday. The schedule is recreated in SM36 before Tuesday arrives.

T-Code: SM36 (define schedule) · SM37 (status Released, future dates) · Check: every periodic job has a next run date in the future
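Checking schedule integrity means comparing the jobs you expect against the jobs that still have a future start. The sketch below assumes a hypothetical list of planned starts exported from the job monitor; the job names and dates are invented.

```python
from datetime import datetime

def jobs_without_next_run(expected_jobs, planned_starts, now):
    """Return expected periodic jobs that have no planned start after `now`."""
    return [
        job for job in expected_jobs
        if not any(start > now for start in planned_starts.get(job, []))
    ]

now = datetime(2026, 4, 13, 6, 0)  # Monday morning check
expected = ["F110_WEEKLY", "JOBLOG_CLEANUP_WEEKLY", "MRP_NIGHTLY"]
planned = {
    "JOBLOG_CLEANUP_WEEKLY": [datetime(2026, 4, 19, 22, 0)],
    "MRP_NIGHTLY": [datetime(2026, 4, 12, 3, 0), datetime(2026, 4, 14, 3, 0)],
}

missing = jobs_without_next_run(expected, planned, now)
print(missing)
```

The list of expected jobs has to be maintained outside the system being checked; otherwise a lost schedule also removes the evidence that it should exist.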
Critical SAP Background Jobs- Must Monitor Daily
Job / Programme | Module | Typical Schedule | Impact if Failed
PC00_M40_CALC | HR Payroll | Monthly (payroll run date) | Employees not paid- extremely critical
RMMRP000 / MD01N | PP- MRP | Daily or multiple times/day | Production planning based on stale data
F110 (APP) | FI- Vendor Payment | Weekly or twice weekly | Vendors not paid- relationship damage
FAGL_FC_VAL | FI- FX Revaluation | Monthly (period-end) | Incorrect forex P&L for the period
AFAB | FI-AA- Depreciation | Monthly (period-end) | No depreciation posted- incorrect asset values
RSBTCDEL | Basis- Job log cleanup | Weekly | Job log table (TBTCO) fills up- new jobs can't start
SAP_REORG_JOBS | Basis- System cleanup | Weekly | System tables grow without bound
ZINTERFACE_SD_OUT | Custom- Interface | Every 15 minutes | External systems (WMS, e-commerce) not updated

💻 ABAP Code- Automating the Morning Health Check

Senior Basis teams write custom ABAP programmes to automate the morning health check. Rather than manually running 10 T-codes, the programme runs all checks automatically at 6:00 AM and sends an email summary to the Basis team. The code below shows how to query critical monitoring data programmatically, similar in spirit to the checks that commercial monitoring tools such as SAP Solution Manager and Focused Run automate.

ABAP- Automated Morning Health Check Programme
"═══════════════════════════════════════════════════════════════
" SAP Basis- Automated Morning Health Check
" Checks: Cancelled jobs, ABAP dumps, open updates, lock entries
" Schedule: Daily at 06:00 AM via SM36 background job
"═══════════════════════════════════════════════════════════════

REPORT z_basis_morning_check.

"──── 1. Check for cancelled background jobs since yesterday ────
DATA: lt_jobs      TYPE TABLE OF tbtco,
      lv_yesterday TYPE sy-datum,
      lv_count     TYPE i.
lv_yesterday = sy-datum - 1.

" Only some columns are read, so map them by name into the TBTCO rows
SELECT jobname, jobcount, status, strtdate, strttime, enddate
  FROM tbtco
  WHERE status = 'A'            "A = Cancelled/Aborted
    AND strtdate >= @lv_yesterday
  INTO CORRESPONDING FIELDS OF TABLE @lt_jobs.

lv_count = lines( lt_jobs ).
WRITE: / 'CANCELLED JOBS:', lv_count.

"──── 2. Check ABAP short dumps since yesterday ─────────────────
" Assumption: dumps are stored in table SNAP as several rows each;
" the row with SEQNO = '000' is counted once per dump.
SELECT COUNT(*)
  FROM snap
  WHERE datum >= @lv_yesterday
    AND seqno = '000'
  INTO @lv_count.

WRITE: / 'ABAP DUMPS LAST 24H:', lv_count.

"──── 3. Check open update records ──────────────────────────────
" Successfully processed update requests are normally deleted from
" VBHDR, so any remaining row is pending or failed. SM13 shows which.
SELECT COUNT(*)
  FROM vbhdr
  INTO @lv_count.

IF lv_count > 0.
  WRITE: / 'WARNING - OPEN UPDATE RECORDS (check SM13):', lv_count.
ELSE.
  WRITE: / 'OK - No open update records.'.
ENDIF.

"──── 4. Check current lock entries ─────────────────────────────
" Locks are held in the enqueue server's memory, not in a database
" table, so they are read with function module ENQUEUE_READ.
DATA: lt_locks TYPE TABLE OF seqg3.
CALL FUNCTION 'ENQUEUE_READ'
  EXPORTING
    gclient = space             "blank = no client restriction
    guname  = space             "blank = no user restriction
  TABLES
    enq     = lt_locks
  EXCEPTIONS
    OTHERS  = 1.

IF sy-subrc = 0.
  lv_count = lines( lt_locks ).
  WRITE: / 'CURRENT LOCK ENTRIES:', lv_count.
ELSE.
  WRITE: / 'Lock table could not be read.'.
ENDIF.

"──── 5. Send email summary to Basis team ────────────────────────
DATA: lo_send TYPE REF TO cl_bcs.
TRY.
    lo_send = cl_bcs=>create_persistent( ).
    "... (add email body from above results and send via BCS API)
  CATCH cx_bcs.
    WRITE: / 'Could not create the BCS send request.'.
ENDTRY.
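
Step 5 of the programme leaves the email body open. One way to complete it with the BCS classes is sketched below as a standalone report. The recipient address and the two sample body lines are hypothetical placeholders; in the real programme the body would be built from the counts collected in steps 1 to 4.

```abap
REPORT z_basis_send_summary.

" Hypothetical recipient address, for illustration only
CONSTANTS gc_recipient TYPE ad_smtpadr VALUE 'basis-team@example.com'.

" Sample body lines; replace with the real check results
DATA lt_body TYPE bcsy_text.
APPEND VALUE #( line = 'Cancelled jobs: see SM37' ) TO lt_body.
APPEND VALUE #( line = 'ABAP dumps: see ST22' )     TO lt_body.

TRY.
    DATA(lo_send) = cl_bcs=>create_persistent( ).

    " Plain-text document with subject and body
    DATA(lo_doc) = cl_document_bcs=>create_document(
                     i_type    = 'RAW'
                     i_text    = lt_body
                     i_subject = 'SAP morning health check' ).
    lo_send->set_document( lo_doc ).

    DATA(lo_recipient) = cl_cam_address_bcs=>create_internet_address(
                           i_address_string = gc_recipient ).
    lo_send->add_recipient( i_recipient = lo_recipient ).

    lo_send->send( ).
    COMMIT WORK.   " the send request is only processed after the commit

  CATCH cx_bcs INTO DATA(lx_bcs).
    DATA(lv_msg) = lx_bcs->get_text( ).
    WRITE: / 'Email could not be sent:', lv_msg.
ENDTRY.
```

The mail then waits in the SAPconnect queue (transaction SOST) until the send job picks it up, so outbound SMTP must already be configured on the system.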
💡 Key Monitoring Tables to Know: TBTCO = background job status overview (status, start/end times). SNAP = ABAP short dump storage (several rows per dump). VBHDR = update request headers (pending and failed updates). Lock entries are not stored in a database table: they are held in the enqueue server's memory and read with function module ENQUEUE_READ, which is the data SM12 displays. SM20 data is stored by default in flat audit log files on the OS, not in a standard database table, which is why it cannot be queried via SE16N. To access SM20 data programmatically, use function module RSAU_READ_AUDIT_LOG.

📋 SAP System Monitoring- Complete T-Code Master Reference

Use this table as your monitoring reference. Every Basis administrator and SAP consultant should know these transactions by heart: they come up regularly in Basis interviews and production support engagements.

SAP System Monitoring- All Key T-Codes by Category
Category | T-Code | Full Name | What to Check | Alert Threshold
System Overview | SM51 | Application Server List | All servers active, no unexpected restarts | Any server offline or restarted outside maintenance window
System Overview | AL08 | Global User Overview | Current logged-in user count by server | Ghost sessions, unexpectedly high/low count
Work Processes | SM50 | Work Process Overview (Local) | Work process status - Wait/Running | DIA Running > 5 min, all WPs occupied
Work Processes | SM66 | Global Work Process Overview | All servers WP utilisation at once | Server at 100% WP = bottleneck
ABAP Errors | ST22 | ABAP Runtime Error Analysis | All short dumps last 24h | DBIF errors, recurring same programme
System Log | SM21 | System Log | Infrastructure errors and warnings | WP restarts, DB failures, ICM errors
Background Jobs | SM37 | Job Overview | All critical jobs status (Finished/Cancelled) | Any critical job Cancelled or Aborted
Background Jobs | SM36 | Define Background Job | Job schedule integrity (next run date) | Critical job with no future run scheduled
Updates | SM13 | Update Records | Failed or pending update requests | Any ERR or INIT status record
Locks | SM12 | Lock Entry List | Stale lock entries from yesterday | Lock dated yesterday = likely crash residue
Spool | SP01 | Spool Request Overview | Failed print jobs | Print jobs with error status
Spool | SP12 | TemSe Administration | Spool (TemSe) database size | > 80% of allocated spool space
Performance | ST03N | Workload Monitor | Transaction response times - DB vs App split | DB time > 60% of total response time
Performance | ST05 | SQL Trace | Database calls - missing indexes, full scans | Any SELECT > 1 second, Full Table Scan
Performance | ST02 | Tune Summary | Buffer utilisation and swap counts | Swap > 0 in any buffer = buffer too small
Performance | ST12 | ABAP Trace | ABAP programme execution time breakdown | Single programme consuming > 80% CPU time
Database | DB02 | Database Performance Monitor | Database size, largest tables, tablespace | > 80% database full = urgent expansion
Database | DB13 | DBA Planning Calendar | Backup job status - last run result | ANY failed backup = immediate escalation
Security | SM20 | Security Audit Log | Failed logons, sensitive T-code usage | 5+ failed logons, off-hours critical access
Security | SUIM | User Information System | SoD conflicts, SAP_ALL users, inactive users | Any non-DDIC user with SAP_ALL
Transports | STMS | Transport Management System | Transport queue status, failed imports | Any transport with error status in PRD queue
Transports | SE10 | Transport Organiser | Transport request details and import logs | Direct changes in PRD without transport
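
The SM12 check in the table (lock entries left over from a previous day) can be sketched in ABAP as follows. It is a minimal sketch that assumes the lock creation date and time are returned in the SEQG3 fields GTDATE and GTTIME, and it treats any lock older than today as a candidate for review, not as proof of a crash.

```abap
REPORT z_check_stale_locks.

DATA lt_locks TYPE TABLE OF seqg3.

" Read the current lock table, as SM12 does
CALL FUNCTION 'ENQUEUE_READ'
  EXPORTING
    gclient = space      " blank = no client restriction
    guname  = space      " blank = no user restriction
  TABLES
    enq     = lt_locks
  EXCEPTIONS
    OTHERS  = 1.

IF sy-subrc <> 0.
  WRITE: / 'Lock table could not be read.'.
  RETURN.
ENDIF.

" List every lock that was set before today
LOOP AT lt_locks INTO DATA(ls_lock) WHERE gtdate < sy-datum.
  WRITE: / ls_lock-guname, ls_lock-gname, ls_lock-gtdate, ls_lock-gttime.
ENDLOOP.
IF sy-subrc <> 0.
  " The loop found no matching row
  WRITE: / 'No lock entries older than today.'.
ENDIF.
```

Before deleting a listed lock in SM12, confirm in SM04 or AL08 that its owner is no longer logged on and in SM13 that no update request still depends on it.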
SAP Monitoring- Scenario-Based Interview Questions
Interview Question | Answer Approach | Key T-Codes
Users report the system is slow. What do you do first? | Check SM66 (WP saturation) → AL08 (user count) → ST03N (workload analysis: DB vs App time) → ST05 if DB time dominates. | SM66 · AL08 · ST03N · ST05 · ST02
A critical background job did not run last night. How do you investigate? | SM37 (find the job, check status) → click Job Log (exact error message) → SM21 (system events at that time) → restart job after fixing root cause. | SM37 · SM21 · SM50
The system crashed unexpectedly and restarted. How do you assess the damage? | SM21 (what caused the crash) → SM13 (any failed updates from the crash) → SM12 (stale lock entries) → SM37 (jobs that were running at crash time). | SM21 · SM13 · SM12 · SM37
How do you prove no one made unauthorised changes to production? | STMS (all transports imported to PRD with approval) → SE10/RDDPRCHK (no direct changes in PRD) → SM20 (no unauthorised transaction usage in production). | STMS · SE10 · SM20 · SUIM
The database is growing very fast. What do you check and do? | DB02 (largest tables, growth trend) → TAANA (table analysis) → start data archiving for MSEG, BSEG, TBTCO → increase storage if archiving is not enough. | DB02 · TAANA · DB13 · DB20
