Getting process info: Win32 and COM with Python and C++
Existing document
\win32\help\process_info.html
Section author: John Nielsen <jn@who.net>
Python is a rich scripting language offering a lot of the power of C++ while retaining the ease of use of VBScript. With its simplified C++ style, win32 access, and ability to make COM servers, it’s a natural rapid development environment for the developer.
When working with an unfamiliar API, Python is great for helping you understand how to solve the problem without getting in the way. Even if you have to supply a C++ COM object, it is often easier to first figure out the details with Python and then compose the C++ solution. Python is very similar to C++ pseudo-code, so you can follow it as an outline for the C++. In this case, we’re going to talk about how to use both Python and C++ to expose a list of processes and their corresponding ids with a COM object.
Introduction
To get process information for both NT and W2K (but not the 9x family) you can use the Performance Data Helper library (PDH), available in the SDK at Microsoft’s ftp site. It provides a convenient interface to performance information that is ultimately stored in the registry. The basic process of using the PDH encompasses the following:
Get a list of all the objects you want
Get a list of the object’s instances and data available for each instance: called ‘items’ or ‘counters’
Get a group of performance data for each counter
In our case the object we want is the process object, the object’s instances are its list of processes, and the counter we want for the processes is ‘ID Process’.
The specific API calls are the following:
PdhEnumObjectItems – get the list of instances (processes) for the process object
PdhOpenQuery – initialize query handle which will contain all the counters (in our case only: ID Process)
PdhMakeCounterPath – each counter has a specific path, kind of like a URL
PdhAddCounter – convert the path to a handle and group the counter paths (in this case only one) together for the query
PdhCollectQueryData – gathers the actual data
PdhGetFormattedCounterValue – call for each counter to convert the data to a typical format
PdhCloseQuery – close the counters and the query handle
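To make the "kind of like a URL" remark concrete, here is a small pure-Python sketch of the textual layout of a counter path. The helper name format_counter_path is hypothetical and exists only for illustration; in real code the path should come from PdhMakeCounterPath (win32pdh.MakeCounterPath in Python). Leaving off the #index suffix for index 0 is an assumption of this sketch, not something the calls above promise.

```python
def format_counter_path(obj, counter, instance=None, index=0, machine=None):
    """Hypothetical helper: build a PDH-style counter path string.

    Layout: \\\\machine\\object(instance#index)\\counter
    The machine part and the (instance#index) part are optional.
    """
    path = ""
    if machine:
        path += "\\\\" + machine
    path += "\\" + obj
    if instance is not None:
        # Assumption for this sketch: index 0 is written without a suffix.
        suffix = "#%d" % index if index > 0 else ""
        path += "(" + instance + suffix + ")"
    path += "\\" + counter
    return path


# Second process named "python" (instance index 1), counter "ID Process".
print(format_counter_path("Process", "ID Process", "python", 1))
```

The object, instance, instance index and counter name are the same pieces that the path-building call takes as separate arguments later in this article.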
We’ll cover these points now in more depth.
Getting the process list: PdhEnumObjectItems
MSDN describes the call as follows:
PDH_STATUS PdhEnumObjectItems(
    LPCTSTR szDataSource,
    LPCTSTR szMachineName,
    LPCTSTR szObjectName,
    LPTSTR mszCounterList,
    LPDWORD pcchCounterListLength,
    LPTSTR mszInstanceList,
    LPDWORD pcchInstanceListLength,
    DWORD dwDetailLevel,
    DWORD dwFlags
);
The Python call is similar, though simpler. For example, you do not need to bother with the list lengths; win32pdh takes care of that for you. Both the Python and C++ examples are taken from their COM components shown later. The call in Python looks like the following:
def proclist(self):
    try:
        junk, instances = win32pdh.EnumObjectItems(
            None,
            None,
            self.object,
            win32pdh.PERF_DETAIL_WIZARD
        )

        return instances

    except:

        raise COMException("Problem getting process list")
The variable instances contains the list of processes, and you’ll find the item or counter that we want, ‘ID Process’, present in the list of items. Since you can have multiple processes with the same name, in Python it is convenient to use a dictionary to record each process name and how many times it occurred. Note that the count starts at zero, so the stored value is the highest instance index for that name.
for instance in instances:
    if instance in proc_dict:
        proc_dict[instance] = proc_dict[instance] + 1
    else:
        proc_dict[instance]=0
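The counting convention is easier to see on a small list. The following is a minimal sketch that runs without win32pdh; the process names are made up for illustration, and count_instances is a hypothetical wrapper around the same loop.

```python
def count_instances(instances):
    """Map each process name to the highest instance index seen (count - 1)."""
    proc_dict = {}
    for name in instances:
        if name in proc_dict:
            proc_dict[name] += 1
        else:
            proc_dict[name] = 0  # first occurrence has instance index 0
    return proc_dict


# Made-up sample of what the instance list might contain.
sample = ["Idle", "System", "svchost", "python", "svchost", "svchost"]
proc_dict = count_instances(sample)

for name, max_index in sorted(proc_dict.items()):
    # range(max_index + 1) yields every instance index for this name
    print(name, list(range(max_index + 1)))
```

Storing the highest index rather than the count is why the later query loop iterates over max_instances+1 values.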
The C++ call, though essentially the same in spirit, is much more involved. To help, I use map (like a Python dictionary) and string from Standard C++. The additional things you need to manage are:
strings – convert to TCHAR for ansi/unicode support using ATL macros
the memory for the buffers that PdhEnumObjectItems needs
parsing its NULL-separated results
HRESULT getinst (map<string,int>& m_inst) {

    map<string, int>::iterator iter;
    USES_CONVERSION;
    LPTSTR szCountersBuf = NULL;
    DWORD dwCountersSize = 0;
    LPTSTR szInstancesBuf = NULL;
    DWORD dwInstancesSize = 0;
    LPTSTR szTemp = NULL;

    PDH_STATUS status;

    std::string str_obj="Process";

    //first call: NULL buffers and zero sizes, to learn the required sizes
    status = PdhEnumObjectItems(
        NULL,
        NULL,
        A2CT(str_obj.c_str()),
        NULL,
        &dwCountersSize,
        szInstancesBuf,
        &dwInstancesSize,
        PERF_DETAIL_WIZARD,
        0 );

    //newer versions of pdh.dll report PDH_MORE_DATA for the sizing call
    if ( ERROR_SUCCESS != status && PDH_MORE_DATA != status )
        return E_FAIL;

    if (dwCountersSize) {
        szCountersBuf = (LPTSTR)malloc (dwCountersSize * sizeof (TCHAR));
        if (szCountersBuf==NULL) {
            return E_FAIL;
        }
    } else
        szCountersBuf=NULL;

    if (dwInstancesSize) {
        szInstancesBuf = (LPTSTR)malloc (dwInstancesSize * sizeof (TCHAR));
        if (szInstancesBuf==NULL) {
            free(szCountersBuf);
            return E_FAIL;
        }
    } else
        szInstancesBuf = NULL;

    //second call: fill the buffers
    status = PdhEnumObjectItems(
        NULL,
        NULL,
        A2CT(str_obj.c_str()),
        szCountersBuf,
        &dwCountersSize,
        szInstancesBuf,
        &dwInstancesSize,
        PERF_DETAIL_WIZARD,
        0);

    if ( ERROR_SUCCESS != status ) {
        free(szCountersBuf);
        free(szInstancesBuf);
        return E_FAIL;
    }

    //it's a series of contiguous NULL terminated strings, ending w/zero length string
    if (szInstancesBuf){
        for (szTemp = szInstancesBuf;*szTemp != 0;szTemp += lstrlen(szTemp) + 1) {
            m_inst[T2A(szTemp)]++; //increment instance counter
                                   //a new map entry starts at zero
        }
    }

    //the names have been copied into the map, so release the buffers
    free(szCountersBuf);
    free(szInstancesBuf);

    return S_OK;

}
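The loop at the end of getinst walks a buffer of back-to-back NULL terminated strings that ends with an empty string. As a rough illustration of that layout only, here is a pure-Python sketch; split_multi_sz is a hypothetical name, the sample buffer is made up, and win32pdh already does this parsing for you in Python.

```python
def split_multi_sz(buf):
    """Split a buffer of NUL-terminated strings, stopping at the empty string.

    Assumes a well-formed buffer: every string is followed by "\\0" and the
    list is closed by one extra "\\0".
    """
    items = []
    pos = 0
    while pos < len(buf) and buf[pos] != "\0":
        end = buf.index("\0", pos)   # terminator of the current string
        items.append(buf[pos:end])
        pos = end + 1                # step past the terminator, like lstrlen()+1
    return items


# Made-up buffer: four names, then the closing empty string.
raw = "Idle\0System\0svchost\0svchost\0\0"
names = split_multi_sz(raw)
print(names)
```

Each step advances by the string length plus one, which is the same arithmetic as the szTemp += lstrlen(szTemp) + 1 expression in the C++ loop.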
Getting the process ids: Several Pdh Calls
A whole sequence of calls is necessary once you get the process list. To refresh your memory, you need:
PdhOpenQuery – initialize query handle which will contain all the counters (in our case only: ID Process)
PdhMakeCounterPath – each counter has a specific path, kind of like a URL
PdhAddCounter – convert the path to a handle and group the counter paths (in this case only one) together for the query
PdhCollectQueryData – gathers the actual data
PdhGetFormattedCounterValue – call for each counter to convert the data to a typical format
PdhCloseQuery – close the counters and the query handle
As usual, the Python code matches this very cleanly (like pseudocode that actually runs) – you can figure out the basic meanings and sequences of the win32 calls without having to deal with other details.
for instance, max_instances in proc_dict.items():
    for inum in range(max_instances+1):
        try:
            hq = win32pdh.OpenQuery() # initializes the query handle
            path = win32pdh.MakeCounterPath( (None,self.object,instance, None, inum, self.item) )
            counter_handle=win32pdh.AddCounter(hq, path) #convert counter path to counter handle
            win32pdh.CollectQueryData(hq) #collects data for the counter
            ctype, val = win32pdh.GetFormattedCounterValue(counter_handle, win32pdh.PDH_FMT_LONG)
            proc_ids.append(instance+'\t'+str(val))
            win32pdh.CloseQuery(hq)
        except:
            raise COMException("Problem getting process id")
Again, the C++ code is more involved and makes use of vector, map, and string from Standard C++. It converts a map of process names and the number of processes for each name into a vector of strings, each of which is a tab-delimited process name and id entry. Also, the process id is returned as a double, which is converted to a string.
HRESULT getprocid (map<string,int>& m_inst, vector<string> &v_ids) {

    USES_CONVERSION;
    PDH_STATUS status = 0;
    HQUERY hQuery = NULL;
    HCOUNTER hCounter = NULL;

    std::string objname="Process";
    std::string counter="ID Process";
    char *buffer;int junk,junk2;

    map<string, int>::iterator iter;
    for (iter=m_inst.begin();iter != m_inst.end();++iter) {

        //iter->second is the number of processes sharing this name,
        //so the valid instance indexes are 0 .. iter->second-1
        for (int i=0;i< iter->second;++i) {
            // initialize the query handle
            status = PdhOpenQuery( NULL, 0, &hQuery );
            if ( status != ERROR_SUCCESS )
                return E_FAIL;

            TCHAR szCounterPath[2048];
            DWORD dwPathSize = 2048;
            PDH_COUNTER_PATH_ELEMENTS pdh_elm;

            pdh_elm.szMachineName = NULL;
            pdh_elm.szObjectName = A2T(objname.c_str());
            pdh_elm.szInstanceName = A2T(iter->first.c_str());
            pdh_elm.szParentInstance = NULL;
            pdh_elm.dwInstanceIndex = i;
            pdh_elm.szCounterName = A2T(counter.c_str());

            status = PdhMakeCounterPath( &pdh_elm, szCounterPath, &dwPathSize, 0 );

            // Add the counter to the query
            //PdhAddCounter converts each counter path into a counter handle
            if ( status == ERROR_SUCCESS )
                status = PdhAddCounter( hQuery, szCounterPath, 0, &hCounter );

            //PdhCollectQueryData gets raw data for the counters
            if ( status == ERROR_SUCCESS )
                status = PdhCollectQueryData(hQuery);

            if ( status != ERROR_SUCCESS ) {
                PdhCloseQuery (hQuery); //don't leak the query on failure
                return E_FAIL;
            }

            //PdhGetFormattedCounterValue formats counter values for display
            DWORD dwFormat = PDH_FMT_DOUBLE;
            PDH_FMT_COUNTERVALUE fmtValue;
            status = PdhGetFormattedCounterValue (hCounter,
                                                  dwFormat,
                                                  (LPDWORD)NULL,
                                                  &fmtValue);

            if (status == ERROR_SUCCESS) {
                buffer=_fcvt( fmtValue.doubleValue, 0, &junk,&junk2 );
                string id=buffer;
                v_ids.push_back(iter->first+'\t'+id);
            }

            //PdhCloseQuery closes the query handle and its counters
            PdhCloseQuery (hQuery);
        }
    }

    return S_OK;
}
The COM client
The Python COM client which will call the C++ COM object and Python COM object does the following:
import win32com.client

a=win32com.client.Dispatch('NtPerf.process') # C++ com object
print(a.procids())

b=win32com.client.Dispatch('PyPerf.process') # python com object
print(b.procids())
As far as the client is concerned, there is no difference between the 2 objects. Both return a list of processes and their respective ids, separated by a tab.
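On the client side, each returned entry is a single string, so a caller will usually want to split it back into a name and a number. Here is a minimal sketch, assuming every entry has the form name, tab, decimal id; parse_procids is a hypothetical helper and the sample entries are made up.

```python
def parse_procids(entries):
    """Turn 'name<TAB>pid' strings into (name, pid) tuples."""
    result = []
    for entry in entries:
        # Split on the last tab so the id is always the final field.
        name, pid_text = entry.rsplit("\t", 1)
        result.append((name, int(pid_text)))
    return result


# Made-up sample of what procids() might return.
sample = ["Idle\t0", "System\t4", "python\t1234"]
pairs = parse_procids(sample)
for name, pid in pairs:
    print("%-10s %d" % (name, pid))
```

Because both COM objects return the same tab-delimited format, the same parsing works regardless of which one the client dispatched.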
Making the COM objects
From a 1000-mile perspective, both ATL C++ and Python offer a class-based approach to COM objects. In both, the methods of the COM object are simply methods of a class. However, creating a COM object in Python is much easier than in C++, again allowing you to focus on the problem first while still retaining the C++ feel.
Many of the details of Python COM objects are handled by a default policy which leverages IDispatch. You simply add a few attributes to your Python class to expose your methods and prog id, add a line to register your class, and you are done. The policy knows what to do. Thus, it’s easy to take a simple win32 class you wrote, add a few attributes, and convert it to a COM object. Creating and developing Python COM objects is simple; all that is needed is Notepad. You don’t need a full-blown IDE, nor do you have to go back and find the source code that created the object, since the object is the source code. It lends itself to very rapid development.
ATL provides wizards, a lot of the basic implementation goo (like COM interfaces for IUnknown and IDispatch), and wrappers for data types. However, there is still a big difference between a simple console-based app and an ATL COM object. The COM world of Variants, SafeArrays, and BSTRs is (as we’ll see below) unfriendly to C++.
Behind the scenes, Python converts back and forth between its native types and BSTRs, Variants, SafeArrays, etc. When you change your COM object, you don’t have to worry about changing IDL and constructing new Variant structures; it is managed for you.
In C++, it is messier. First of all: strings. Since the concept of what text is isn’t consistent, COM standardizes on its own OLECHAR. Also, for non-COM text, because of issues between using ansi and unicode, you need to use the TCHAR data type, which is a generic type that maps at compile time to whichever is necessary. And, with regard to BSTRs (length-prefixed strings), you need to convert OLECHAR strings to BSTRs with SysAllocString. You will notice that the COM object below uses the Standard C++ string (which I prefer to use). When necessary, it converts the string to the required COM type using ATL conversion macros and SysAllocString (another option is to use the CComBSTR class).
Secondly: arrays. Since we are returning a list of processes and their ids, to be friendly with all languages, everything needs to be converted to SafeArrays of Variants housing BSTRs. Python does the conversion for you. With C++, you’ll need to make various Variant calls to create the necessary structure. In the C++ COM object, I’ve encapsulated all the necessary code in a single function that converts any vector of strings into a one-dimensional SafeArray of Variants.
Now for some code:
The Python COM object
The Python COM object has 2 methods, proclist and procids. Proclist is trivial, simply returning the list of processes from EnumObjectItems. Procids calls proclist, constructs a dictionary to count the number of processes with the same name, and then makes the necessary calls to get their ids. Each function simply returns a Python list (which is then converted for you). If you later decide to return only a single string, simply change what you return, and Python again will convert for you. Each method also uses the very cool COMException, which returns errors back to the client. In addition to the methods, there are 4 basic attributes that I set to define the COM object, and a line to register it.
import win32pdh, string, win32api
from win32com.server.exception import COMException
import win32com.server.util
import win32com.client.dynamic

#to generate guids use:
#import pythoncom
#print(pythoncom.CreateGuid())

class pyperf:
    # COM attributes.
    _reg_clsid_ = '{763AE791-1D6B-11D4-A38B-00902798B22B}'
               #guid for your class in registry
    _reg_desc_ = "get process list and ids"
    _reg_progid_ = "PyPerf.process" #The progid for this class

    _public_methods_ = ['procids','proclist' ]  #names of callable methods

    def __init__(self):
        self.object='process'
        self.item='ID Process'

    def proclist(self):
        try:
            junk, instances = win32pdh.EnumObjectItems(None,None,self.object, win32pdh.PERF_DETAIL_WIZARD)
            return instances
        except:
            raise COMException("Problem getting process list")

    def procids(self):
        #each instance is a process, you can have multiple processes w/same name
        instances=self.proclist()
        proc_ids=[]
        proc_dict={}
        for instance in instances:
            if instance in proc_dict:
                proc_dict[instance] = proc_dict[instance] + 1
            else:
                proc_dict[instance]=0

        for instance, max_instances in proc_dict.items():
            for inum in range(max_instances+1):
                try:
                    hq = win32pdh.OpenQuery() # initializes the query handle
                    path = win32pdh.MakeCounterPath( (None,self.object,instance, None, inum, self.item) )
                    counter_handle=win32pdh.AddCounter(hq, path) #convert counter path to counter handle
                    win32pdh.CollectQueryData(hq) #collects data for the counter
                    ctype, val = win32pdh.GetFormattedCounterValue(counter_handle, win32pdh.PDH_FMT_LONG)
                    proc_ids.append(instance+'\t'+str(val))
                    win32pdh.CloseQuery(hq)
                except:
                    raise COMException("Problem getting process id")

        proc_ids.sort()
        return proc_ids

if __name__=='__main__':
    import win32com.server.register
    win32com.server.register.UseCommandLine(pyperf)
The C++ COM object
As you can see from the IDL, the C++ COM object also exposes 2 methods, proclist and procids. Proclist calls the getinst function, which returns a map of processes, converts that to a vector of strings, and then calls make_safe to convert that to a SafeArray of Variants. Procids does much the same except that after calling getinst, it then calls getprocid, which returns a vector of strings containing the processes and their ids. The vector of strings is then converted to a SafeArray with make_safe. Unlike Python, you don’t actually return the SafeArray (since every COM method has to return an HRESULT). Instead, you store the values through a Variant pointer.
Here is the relevant excerpt from the IDL:
interface Iprocess : IDispatch
{
    [id(1), helpstring("lists current processes")] HRESULT proclist([out, retval] VARIANT *plist);
    [id(2), helpstring("method procids")] HRESULT procids([out, retval] VARIANT *pids);
}
Here is the source for the cpp file:
#include "stdafx.h"
#include "Ntperf.h"
#include "process.h"

//MS stuff
#include "pdh.h"
#include "pdhmsg.h"

// fix problem with different versions of pdh.dll
#undef PdhOpenQuery // PdhOpenQueryA
extern "C" long __stdcall
PdhOpenQuery (
    IN LPCSTR szDataSource,
    IN DWORD dwUserData,
    IN HQUERY *phQuery
    );

//STD C++ stuff
#pragma warning(disable : 4786) //get rid of stl warnings
#include <string>
#include <vector>
#include <map>
using namespace std;

/////////////////////////////////////////////////////////////////////////////
// Cprocess

HRESULT make_safe(vector<string>& v_list, VARIANT *plist) {

    HRESULT hr = S_OK;

    USES_CONVERSION;

    VariantInit(plist);

    //now create the 1 dimensional safearray of variants
    LPSAFEARRAY psa;
    SAFEARRAYBOUND rgsabound[1];
    rgsabound[0].cElements = (ULONG)v_list.size(); // size elements
    rgsabound[0].lLbound = 0;                      // 0-based
    psa = SafeArrayCreate(VT_VARIANT, 1, rgsabound);
    if (!psa) { return E_OUTOFMEMORY; }

    VARIANT * VarArray;
    //Increment lock count and get pointer to the array data
    if (FAILED(hr = SafeArrayAccessData(psa,(void **) &VarArray ))) {
        SafeArrayDestroy(psa);
        return hr;
    }

    for (size_t i =0; i<v_list.size();i++) {
        //convert ascii to olestr then bstr
        BSTR bstr = SysAllocString(A2OLE(v_list[i].c_str()));
        if (!bstr) {
            //release the lock, then free the array and the strings stored so far
            SafeArrayUnaccessData( psa );
            SafeArrayDestroy(psa);
            return E_OUTOFMEMORY;
        }
        VarArray[i].vt = VT_BSTR;
        VarArray[i].bstrVal = bstr;
    }

    SafeArrayUnaccessData( psa );

    plist->vt = VT_ARRAY | VT_VARIANT; //set type of plist to variant array
    plist->parray = psa; //now set the array in plist to be the created array

    return S_OK;
}

HRESULT getprocid (map<string,int>& m_inst, vector<string> &v_ids) {

    USES_CONVERSION;
    PDH_STATUS status = 0;
    HQUERY hQuery = NULL;
    HCOUNTER hCounter = NULL;

    std::string objname="Process";
    std::string counter="ID Process";
    char *buffer;int junk,junk2;

    map<string, int>::iterator iter;
    for (iter=m_inst.begin();iter != m_inst.end();++iter) {

        //iter->second is the number of processes sharing this name,
        //so the valid instance indexes are 0 .. iter->second-1
        for (int i=0;i< iter->second;++i) {
            // initialize the query handle
            status = PdhOpenQuery( NULL, 0, &hQuery );
            if ( status != ERROR_SUCCESS )
                return E_FAIL;

            TCHAR szCounterPath[2048];
            DWORD dwPathSize = 2048;
            PDH_COUNTER_PATH_ELEMENTS pdh_elm;

            pdh_elm.szMachineName = NULL;
            pdh_elm.szObjectName = A2T(objname.c_str());
            pdh_elm.szInstanceName = A2T(iter->first.c_str());
            pdh_elm.szParentInstance = NULL;
            pdh_elm.dwInstanceIndex = i;
            pdh_elm.szCounterName = A2T(counter.c_str());

            status = PdhMakeCounterPath( &pdh_elm, szCounterPath, &dwPathSize, 0 );

            // Add the counter to the query
            //PdhAddCounter converts each counter path into a counter handle
            if ( status == ERROR_SUCCESS )
                status = PdhAddCounter( hQuery, szCounterPath, 0, &hCounter );

            //PdhCollectQueryData gets raw data for the counters
            if ( status == ERROR_SUCCESS )
                status = PdhCollectQueryData(hQuery);

            if ( status != ERROR_SUCCESS ) {
                PdhCloseQuery (hQuery); //don't leak the query on failure
                return E_FAIL;
            }

            //PdhGetFormattedCounterValue formats counter values for display
            DWORD dwFormat = PDH_FMT_DOUBLE;
            PDH_FMT_COUNTERVALUE fmtValue;
            status = PdhGetFormattedCounterValue (hCounter,
                                                  dwFormat,
                                                  (LPDWORD)NULL,
                                                  &fmtValue);

            if (status == ERROR_SUCCESS) {
                buffer=_fcvt( fmtValue.doubleValue, 0, &junk,&junk2 );
                string id=buffer;
                v_ids.push_back(iter->first+'\t'+id);
            }

            //PdhCloseQuery closes the query handle and its counters
            PdhCloseQuery (hQuery);
        }
    }

    return S_OK;
}

HRESULT getinst (map<string,int>& m_inst) {

    map<string, int>::iterator iter;
    USES_CONVERSION;
    LPTSTR szCountersBuf = NULL;
    DWORD dwCountersSize = 0;
    LPTSTR szInstancesBuf = NULL;
    DWORD dwInstancesSize = 0;
    LPTSTR szTemp = NULL;

    PDH_STATUS status;

    std::string str_obj="Process";

    //first call: NULL buffers and zero sizes, to learn the required sizes
    status = PdhEnumObjectItems(
        NULL,
        NULL,
        A2CT(str_obj.c_str()),
        NULL,
        &dwCountersSize,
        szInstancesBuf,
        &dwInstancesSize,
        PERF_DETAIL_WIZARD,
        0 );

    //newer versions of pdh.dll report PDH_MORE_DATA for the sizing call
    if ( ERROR_SUCCESS != status && PDH_MORE_DATA != status )
        return E_FAIL;

    if (dwCountersSize) {
        szCountersBuf = (LPTSTR)malloc (dwCountersSize * sizeof (TCHAR));
        if (szCountersBuf==NULL) {
            return E_FAIL;
        }
    } else
        szCountersBuf=NULL;

    if (dwInstancesSize) {
        szInstancesBuf = (LPTSTR)malloc (dwInstancesSize * sizeof (TCHAR));
        if (szInstancesBuf==NULL) {
            free(szCountersBuf);
            return E_FAIL;
        }
    } else
        szInstancesBuf = NULL;

    //second call: fill the buffers
    status = PdhEnumObjectItems(
        NULL,
        NULL,
        A2CT(str_obj.c_str()),
        szCountersBuf,
        &dwCountersSize,
        szInstancesBuf,
        &dwInstancesSize,
        PERF_DETAIL_WIZARD,
        0);

    if ( ERROR_SUCCESS != status ) {
        free(szCountersBuf);
        free(szInstancesBuf);
        return E_FAIL;
    }

    //it's a series of contiguous NULL terminated strings, ending w/zero length string
    if (szInstancesBuf){
        for (szTemp = szInstancesBuf;*szTemp != 0;szTemp += lstrlen(szTemp) + 1) {
            m_inst[T2A(szTemp)]++; //increment instance counter
                                   //a new map entry starts at zero
        }
    }

    //the names have been copied into the map, so release the buffers
    free(szCountersBuf);
    free(szInstancesBuf);

    return S_OK;
}

STDMETHODIMP Cprocess::proclist(VARIANT *plist)
{

    if (!plist) { return E_INVALIDARG;}

    HRESULT hr = NOERROR;

    map<string,int> m_inst;
    map<string, int>::iterator iter;

    hr=getinst(m_inst);
    if (FAILED(hr)) {return hr;}

    vector<string> v_inst;
    for (iter=m_inst.begin();iter != m_inst.end();++iter) {
        //go through index of processes
        for(int i=0;i<iter->second;i++){
            //put onto vector multiple procs w/same name
            v_inst.push_back(iter->first);
        }
    }

    //send a vector of strings and a variant to make_safe
    hr = make_safe(v_inst, plist);

    return hr;

}

STDMETHODIMP Cprocess::procids(VARIANT *pids)
{
    if (!pids) { return E_INVALIDARG;}

    HRESULT hr = NOERROR;

    map<string,int> m_inst;
    map<string, int>::iterator iter;

    hr=getinst(m_inst);
    if (FAILED(hr)) {return hr;}

    vector<string> v_ids;
    getprocid(m_inst,v_ids);

    //send a vector of strings and a variant to make_safe
    hr = make_safe(v_ids, pids);

    return hr;
}
In Conclusion
That was a quick tour of Python and C++ in the win32 and COM world. Both languages have their strengths and weaknesses. With C++ you have ultimate granularity and power. It obviously comes at a cost of more details to keep track of. Python’s strength is rich productivity. It is fast to write the win32 and COM server code, yet you still have a sophisticated language at your disposal. You lose some of the flexibility of C++, which often does not matter. And, when it does, Python can help you understand how to solve the problem before wading into the details.
Have a great time programming with Python!
Further Info
Pdh stuff found at ftp://ftp.microsoft.com in something similar to /developr/platformsdk/april2000/x86/redist/pdh
Microsoft MSDN references at http://msdn.microsoft.com
Relevant Pdh Python libraries: win32pdh.py, win32pdhutil.py