Desktop Operation API
The leanpro.common library of CukeTest provides a set of objects that simulate mouse, keyboard, and screen operations.
Because these operations are implemented through simulation, they can be used on all platforms.
Uses
CukeTest's object recognition APIs operate on controls directly, but some automation still has to be done through a combination of keyboard and mouse operations. For this purpose CukeTest provides two objects that simulate mouse and keyboard input: Mouse and Keyboard. They cover cases where there is no control at the target location of the mouse movement, or where the object manipulation API cannot perform the action. For example, some applications hide menus at the edge of the screen and expand them only when the mouse moves there; direct mouse and keyboard operations are required in that case.
Automation scripts can then drive the mouse and keyboard directly, shown here in JavaScript and then in Python:
const { Keyboard, Mouse } = require('leanpro.common');
Mouse.move(1920,1080);
Keyboard.keyDown("control");
Keyboard.keyTap("a");
Keyboard.keyUp("control");
from leanproAuto import Keyboard, Mouse
Mouse.move(1920,1080)
Keyboard.keyDown("control")
Keyboard.keyTap("a")
Keyboard.keyUp("control")
In addition, because Mouse movement and click operations are based on the screen coordinate system, CukeTest also provides the Screen object, which can read screen properties (such as resolution) and take screenshots.
Screen.capture();
Screen.capture()
Note that these are synchronous methods and do not require the use of await.
Mouse
The Mouse object implements mouse movement, clicks, and press/release of individual mouse buttons. For example, some applications hide part of their interface at the edge of the screen and show it only after the mouse moves there, a scenario that is hard to implement with control operations. The Mouse object is designed for this kind of problem.
Type file definition
export class Mouse {
    static move(x: number, y: number): void;
    static moveSmooth(x: number, y: number, seconds?: number): void;
    static drag(x: number, y: number, seconds?: number): void;
    static setDelay(delay: number): void;
    static position(): Point;
    static click(button: MouseKey): void;
    static doubleClick(button: MouseKey): void;
    static keyDown(button: MouseKey): void;
    static keyUp(button: MouseKey): void;
    static wheel(vertical: number, horizontal: number): void;
}
enum MouseKey {
    LButton = 1,
    RButton = 2,
    MButton = 4,
    Ctrl = 8,
    Shift = 16,
    Alt = 32
}
interface Point {
    x: number,
    y: number
}
class Mouse():
    def move(x: int, y: int) -> None
    def moveSmooth(x: int, y: int, seconds: Optional[int]=None) -> None
    def setDelay(delay: int) -> None
    def drag(x: int, y: int, seconds: Optional[int]=None) -> None
    def position() -> TypedDict
    def click(button: Optional[int]=None) -> None
    def doubleClick(button: Optional[int]=None) -> None
    def keyDown(button: Optional[int]=None) -> None
    def keyUp(button: Optional[int]=None) -> None
    def wheel(vertical: int, horizontal: Optional[int]=None) -> None
class Point():
    x: int
    y: int
API Introduction
move(x, y): void
Move the mouse to the target position. The mouse button is released before moving.
- x: number type, the horizontal pixel of the desktop coordinate.
- y: number type, the vertical pixel of the desktop coordinate.
- Return value: synchronous method that does not return any value.
moveSmooth(x: number, y: number, seconds?: number): void
Smoothly move the mouse to the target position; the movement is closer to how a person moves the mouse.
- x: number type, the horizontal pixel of the desktop coordinate.
- y: number type, the vertical pixel of the desktop coordinate.
- seconds: (optional) number type, the time taken for the action to complete; the smaller the value, the faster the action.
- Return value: synchronous method that does not return any value.
drag(x: number, y: number, seconds?: number): void
Move the mouse to the target position; used together with keyDown() and keyUp() to complete a drag-and-drop operation.
- x: number type, the horizontal pixel of the desktop coordinate.
- y: number type, the vertical pixel of the desktop coordinate.
- seconds: (optional) number type, the time taken for the action to complete; the smaller the value, the faster the action.
- Return value: synchronous method that does not return any value.
setDelay(delay: number): void
By default there is a 10 ms wait between mouse operations. Use this method to change the wait time, given in milliseconds.
- delay: number type, the interval time in milliseconds.
- Return value: synchronous method that does not return any value.
position(): Point
Get the mouse position.
- Return value: Point type, an object of the form {x: 100, y: 100}.
click(button: MouseKey): void
Perform a mouse click.
- button: MouseKey type, an enumeration (see the type file definition); passing MouseKey.LButton and passing 1 have the same effect.
- Return value: synchronous method that does not return any value.
doubleClick(button: MouseKey): void
Perform a mouse double click.
- button: MouseKey type, an enumeration (see the type file definition); passing MouseKey.LButton and passing 1 have the same effect.
- Return value: synchronous method that does not return any value.
keyDown(button: MouseKey): void
Press and hold a mouse button, usually used to implement drag-and-drop operations.
- button: MouseKey type, an enumeration (see the type file definition); passing MouseKey.LButton and passing 1 have the same effect.
- Return value: synchronous method that does not return any value.
keyUp(button: MouseKey): void
Release a mouse button, usually used to implement drag-and-drop operations.
- button: MouseKey type, an enumeration (see the type file definition); passing MouseKey.LButton and passing 1 have the same effect.
- Return value: synchronous method that does not return any value.
wheel(vertical: number, horizontal: number): void
Scroll the mouse wheel, for regular vertical scrolling as well as horizontal scrolling.
- vertical: number type, the amount of vertical scrolling; positive scrolls up, negative scrolls down.
- horizontal: number type, the amount of horizontal scrolling; positive scrolls left, negative scrolls right.
- Return value: synchronous method that does not return any value.
The wheel() method has fairly fine precision, so pass relatively large values when calling it so that the scrolling is noticeable.
Scripts for drag-and-drop operations
The script that implements the drag-and-drop operation is as follows:
Mouse.move(0, 0);
Mouse.keyDown(MouseKey.LButton);
Mouse.drag(100, 100);
Mouse.keyUp(1);
Mouse.move(0, 0)
Mouse.keyDown(MouseKey.LButton)
Mouse.drag(100, 100)
Mouse.keyUp(1)
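The drag-and-drop sequence can also be wrapped in a small helper so that the button is always released, even if the drag step fails. The sketch below is only an illustration: dragBetween and createRecordingMouse are hypothetical names, not part of CukeTest, and the recording stand-in replaces the real Mouse object so the example runs without CukeTest installed.

```javascript
// Hypothetical helper (not a CukeTest API): drags from one point to another
// using any object that exposes move/keyDown/drag/keyUp like Mouse does.
function dragBetween(mouse, from, to, button = 1) {
    mouse.move(from.x, from.y);   // go to the start point
    mouse.keyDown(button);        // press and hold (1 is the left button)
    try {
        mouse.drag(to.x, to.y);   // move while the button is held
    } finally {
        mouse.keyUp(button);      // always release, even if drag throws
    }
}

// Stand-in that records calls instead of moving the real mouse.
function createRecordingMouse() {
    const calls = [];
    const record = (name) => (...args) => { calls.push([name, ...args]); };
    return {
        calls,
        move: record('move'),
        keyDown: record('keyDown'),
        drag: record('drag'),
        keyUp: record('keyUp'),
    };
}

const fakeMouse = createRecordingMouse();
dragBetween(fakeMouse, { x: 0, y: 0 }, { x: 100, y: 100 });
console.log(fakeMouse.calls);
```

In a CukeTest script, the Mouse object from leanpro.common would presumably be passed in place of the recording stand-in.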
Keyboard
The counterpart to mouse simulation is keyboard simulation; keyboard-related operations belong to the Keyboard object.
Type file definition
export class Keyboard {
    static Keys: Keys;
    static keyTap(key: string): void;
    static unicodeTap(keyCode: number): void;
    static keyDown(key: string): void;
    static keyUp(key: string): void;
    static setDelay(milliseconds: number): void;
    // static typeString(str: string, cpm?: number): void; // deprecated
    static pressKeys(keys: string, options?: PressKeysOptions | number): Promise<void>;
    static disableIme();
}
class Keyboard():
    def keyTap(key: str) -> None
    def unicodeTap(keyCode: int) -> None
    def keyDown(key: str) -> None
    def keyUp(key: str) -> None
    def setDelay(milliseconds: int) -> None
    def typeString(str: str, cpm: Optional[int]=None) -> None
    def pressKeys(keys: str, cpm: Optional[int]=None) -> None
    def disableIme() -> None
API Introduction
keyTap(key): void
Tap a key, that is, press and release it.
- key: string type, the key value of the target key, see the Keys table.
- Return value: synchronous method that does not return any value.
unicodeTap(keyCode): void
Tap a key specified by its Unicode value. In JavaScript, you can call charCodeAt on a string to get this value. For example, the following code types the string, with the same effect as calling typeString.
'您好,中国(China)'.split('').map(k => Keyboard.unicodeTap(k.charCodeAt(0)));
- keyCode: number type, the Unicode value of the target character.
- Return value: synchronous method that does not return any value.
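The conversion from text to numeric values can be tried on its own. The sketch below uses a hypothetical helper, toKeyCodes, which is not part of CukeTest; it only shows what charCodeAt reports for each position in a string. Note that charCodeAt works on UTF-16 code units, so a character outside the Basic Multilingual Plane produces two values; how unicodeTap treats such pairs is not covered here.

```javascript
// Hypothetical helper (not a CukeTest API): converts text into the numeric
// values that charCodeAt reports, one per UTF-16 code unit.
function toKeyCodes(text) {
    const codes = [];
    for (let i = 0; i < text.length; i++) {
        codes.push(text.charCodeAt(i));
    }
    return codes;
}

console.log(toKeyCodes('Hi!')); // [ 72, 105, 33 ]
```

In a CukeTest script, each value would then be passed to Keyboard.unicodeTap in order.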
keyDown(key): void
Press and hold a key.
- key: string type, the key value of the target key, see the Keys table.
- Return value: synchronous method that does not return any value.
keyUp(key): void
Release a key.
- key: string type, the key value of the target key, see the Keys table.
- Return value: synchronous method that does not return any value.
setDelay(delay): void
Set the interval between keyboard operations; the default is 10 ms.
- delay: number type, the interval time in milliseconds.
- Return value: synchronous method that does not return any value.
typeString(str, cpm): Promise<void>
Deprecated; use pressKeys(str, {textOnly: true}) instead.
Enter a string of characters.
- str: string type, the string to be entered.
- cpm: (optional) number type, characters per minute; controls the speed at which characters are entered. If an application under test cannot keep up with fast input and displays characters incorrectly, set a lower cpm, e.g. cpm = 60.
- Return value: asynchronous method that does not return any value.
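To get a feel for what a cpm value means in practice, it can be converted to a nominal delay per character. The helpers below, cpmToDelayMs and estimateTypingMs, are hypothetical and not part of CukeTest; they only do the arithmetic (60,000 ms per minute divided by cpm) and say nothing about the speed an application actually achieves.

```javascript
// Hypothetical helpers (not CukeTest APIs) for reasoning about cpm values.

// Nominal pause per character, in milliseconds, for a given cpm.
function cpmToDelayMs(cpm) {
    if (!Number.isFinite(cpm) || cpm <= 0) {
        throw new RangeError('cpm must be a positive number');
    }
    return 60000 / cpm;
}

// Nominal time to type a whole string at the given cpm.
function estimateTypingMs(text, cpm) {
    return text.length * cpmToDelayMs(cpm);
}

console.log(cpmToDelayMs(60));               // 1000
console.log(estimateTypingMs('hello', 200)); // 1500
```

At cpm = 60 the nominal pace is one character per second, which shows why a low cpm helps applications that cannot keep up with fast input.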
pressKeys(keys, options?): Promise<void>
Enter a key or a string; the target control is focused before input. When a string is passed in, certain special characters in it (^+~%{}()) are interpreted as control keys (the Shift key, the CTRL key, etc.) and the symbols themselves are not entered. For details, see Appendix: Input Key Correspondence Table. To enter plain text and ignore these control symbols, use the {textOnly: true} option: pressKeys(str, {textOnly: true}).
- keys: string type, the keys, key combinations, or string to be entered; up to 1024 characters are supported.
- options: (optional) parameters that control the input mode. When options is a number, it is treated as the cpm parameter.
  - textOnly: enter only the string, treating control characters as text as well. The effect is equivalent to calling Keyboard.typeString().
  - cpm: the number of characters per minute, used to control the speed of text input. A cpm value of 200 or higher is recommended for automated operation. Because of the method's internal implementation and the different ways systems and applications process text input, the actual input speed may not always reach the configured cpm.
- Return value: asynchronous method that does not return any value.
For more instructions and samples, see Simulating Keyboard Operations.
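Two details of pressKeys lend themselves to small pre-checks before calling it: whether a string contains any of the special characters listed above, and whether it exceeds the 1024-character limit. The sketch below uses hypothetical helpers, needsTextOnly and chunkText, which are not part of CukeTest. Splitting is only safe for plain text intended for {textOnly: true}, since a split could otherwise cut a control sequence in half.

```javascript
// Hypothetical helpers (not CukeTest APIs) for preparing pressKeys input.

// The special characters that pressKeys interprets as control keys.
const CONTROL_CHARS = /[\^+~%{}()]/;

// True when the text contains a control character, in which case plain-text
// input would need the {textOnly: true} option.
function needsTextOnly(text) {
    return CONTROL_CHARS.test(text);
}

// Splits plain text into pieces no longer than the documented limit.
function chunkText(text, limit = 1024) {
    const chunks = [];
    for (let i = 0; i < text.length; i += limit) {
        chunks.push(text.slice(i, i + limit));
    }
    return chunks;
}

console.log(needsTextOnly('50% (approx)')); // true
console.log(needsTextOnly('hello world'));  // false
console.log(chunkText('a'.repeat(2500)).map((c) => c.length)); // [ 1024, 1024, 452 ]
```

Each chunk could then be passed to pressKeys in turn, awaiting each call because the method is asynchronous.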
disableIme(): void
Disable the input method (IME) of the application that currently has focus, which means switching to the English keyboard. Other methods that send keyboard input, such as typeString or pressKeys, are then not interfered with by the input method.
- Return value: asynchronous method that does not return any value.
Note: the input method is application-dependent. If a new application under test is started, it may be necessary to call disableIme() again.
Keys
Keys is an enumeration object that lists all the keys accepted as parameters by the keyTap, keyDown, and keyUp methods. The key names and descriptions are shown in the following table.
For letter keys and number keys, use the character of the same name directly; they are not listed in the table below. For example, to press the letter key b and the number key 5, pass "b" and "5".
Windows logo key: as a control key specific to Windows keyboards, it is addressed with the "command" key name, the same name used for the Cmd key on a Mac.
Key name | Descriptions | Remarks |
---|---|---|
backspace | ||
delete | ||
enter | ||
tab | ||
escape | ||
up | Up Arrow Key | |
down | Down Arrow Key | |
right | Right arrow key | |
left | Left arrow key | |
home | ||
end | ||
pageup | ||
pagedown | ||
f1 | ||
f2 | ||
f3 | ||
f4 | ||
f5 | ||
f6 | ||
f7 | ||
f8 | ||
f9 | ||
f10 | ||
f11 | ||
f12 | ||
command | CMD key or Windows key (depending on the system) | |
alt | ||
control | ||
shift | ||
right_shift | ||
space | ||
printscreen | Not supported on Mac | |
insert | Not supported on Mac | |
audio_mute | Mute | |
audio_vol_down | Volume down | |
audio_vol_up | Volume up | |
audio_play | Play | |
audio_stop | Stop | |
audio_pause | Pause | |
audio_prev | Previous track | |
audio_next | Next track | |
audio_rewind | Linux only | |
audio_forward | Linux only | |
audio_repeat | Linux only | |
audio_random | Linux only | |
numpad_0 | Numeric Keypad 0 | |
numpad_1 | Numeric Keypad 1 | |
numpad_2 | Numeric Keypad 2 | |
numpad_3 | Numeric Keypad 3 | |
numpad_4 | Numeric Keypad 4 | |
numpad_5 | Numeric Keypad 5 | |
numpad_6 | Numeric Keypad 6 | |
numpad_7 | Numeric Keypad 7 | |
numpad_8 | Numeric Keypad 8 | |
numpad_9 | Numeric Keypad 9 | |
numpad_+ | Numeric Keypad + | |
numpad_- | Numeric Keypad - | |
numpad_* | Numeric Keypad * | |
numpad_/ | Numeric Keypad / | |
numpad_. | Numeric Keypad . | |
lights_mon_up | Increase monitor brightness | Not supported on Windows |
lights_mon_down | Reduce monitor brightness | Not supported on Windows |
lights_kbd_toggle | Toggle keyboard backlight | Not supported on Windows |
lights_kbd_up | Increase keyboard backlight brightness | Not supported on Windows |
lights_kbd_down | Reduce keyboard backlight brightness | Not supported on Windows |
Usage of keys
The Keys enumeration is typically used for special keys, and there are two ways to use it. For example, to press the CTRL key and the A key together to select all, you can pass the key names as strings:
const { Keyboard } = require('leanpro.common');
Keyboard.keyDown('control');
Keyboard.keyTap('a');
Keyboard.keyUp('control');
from leanproAuto import Keyboard
Keyboard.keyDown('control')
Keyboard.keyTap('a')
Keyboard.keyUp('control')
The keyTap, keyDown, and keyUp methods accept a single character (e.g. 'a') by default. When a longer string is passed (e.g. 'control'), it is automatically looked up in the Keys enumeration. This is equivalent to importing the Keys enumeration and using it explicitly, as follows:
const { Keyboard, Keys } = require('leanpro.common');
Keyboard.keyDown(Keys.control);
Keyboard.keyTap('a');
Keyboard.keyUp(Keys.control);
If you wish to enter a string directly, use the typeString method or, since typeString is deprecated, pressKeys(str, {textOnly: true}).
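The keyDown / keyTap / keyUp pattern used for key combinations can be wrapped so that the modifier is released even if the inner step throws. The sketch below is an illustration only: withKeyHeld and createRecordingKeyboard are hypothetical names, not part of CukeTest, and the recording stand-in replaces the real Keyboard object so the example runs on its own.

```javascript
// Hypothetical helper (not a CukeTest API): holds a modifier key while an
// action runs, then releases it no matter how the action ends.
function withKeyHeld(keyboard, modifier, action) {
    keyboard.keyDown(modifier);
    try {
        return action();
    } finally {
        keyboard.keyUp(modifier);
    }
}

// Stand-in that records calls instead of sending real key events.
function createRecordingKeyboard() {
    const calls = [];
    const record = (name) => (key) => { calls.push([name, key]); };
    return {
        calls,
        keyDown: record('keyDown'),
        keyTap: record('keyTap'),
        keyUp: record('keyUp'),
    };
}

const fakeKeyboard = createRecordingKeyboard();
withKeyHeld(fakeKeyboard, 'control', () => fakeKeyboard.keyTap('a'));
console.log(fakeKeyboard.calls);
```

With the real Keyboard object passed in, the same call would presumably perform the select-all combination shown earlier.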
Screen
The Screen automation object is used to get screen properties and to operate on the screen.
Type file definition
The type file is defined as follows:
export class Screen {
    static screenRect(monitor?: number): Rect;
    static all(monitor?: number): Rect[];
    static colorAt(x: number, y: number): string;
    static capture(rect?: Rect): Buffer;
    static captureToFile(filePath: string, rect?: Rect): void;
    static takeScreenshot(filePath: string, monitor?: number): string | void;
    // deprecated
    static screenSize(): {width: number, height: number};
}
class Screen():
    def all() -> "List[Rect]"
    def screenRect(monitor: Optional[int]=None) -> "Rect"
    def screenSize() -> TypedDict
    def colorAt(x: int, y: int) -> str
    def capture(rect: Optional[Rect]=None) -> "bytearray"
    def captureToFile(filePath: str, rect: Optional[Rect]=None) -> None
    def takeScreenshot(filePath: str, monitor: Optional[int]=None) -> Union[str, None]
class ScreenSize():
    width: int
    height: int
API Introduction
screenRect(monitor?:number): Rect
Get the screen rectangle. By default this is the combined rectangle of all screens; if the monitor parameter is specified, only the rectangle of the corresponding screen is returned.
- monitor: (optional) the monitor number. Passing no argument or -1 returns the combined size of all screens, which equals the size of the screenshot produced by the Screen.capture method. Pass 0 for the first display, 1 for the second, and so on.
- Return value: Rect object representing the screen size.
all(): Rect[]
Get the rectangles of all displays.
- Return value: an array of Rect objects, one for each screen.
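The relationship between the per-display rectangles and the combined rectangle can be illustrated with a bounding-box calculation. The sketch below is based on an assumption: it treats a Rect as an object with x, y, width, and height fields, which this section does not define, and boundingRect is a hypothetical helper rather than a CukeTest API. The monitor layout is made-up sample data.

```javascript
// Hypothetical helper (not a CukeTest API). Assumes each rectangle has
// x, y, width, and height fields; the actual Rect type is defined elsewhere.
function boundingRect(rects) {
    if (rects.length === 0) {
        throw new Error('expected at least one rectangle');
    }
    const left = Math.min(...rects.map((r) => r.x));
    const top = Math.min(...rects.map((r) => r.y));
    const right = Math.max(...rects.map((r) => r.x + r.width));
    const bottom = Math.max(...rects.map((r) => r.y + r.height));
    return { x: left, y: top, width: right - left, height: bottom - top };
}

// Made-up layout: a 1920x1080 display with a 1280x1024 display to its right.
const monitors = [
    { x: 0, y: 0, width: 1920, height: 1080 },
    { x: 1920, y: 0, width: 1280, height: 1024 },
];
console.log(boundingRect(monitors)); // { x: 0, y: 0, width: 3200, height: 1080 }
```

Under that assumption, the result for the array returned by Screen.all() would be expected to correspond to what Screen.screenRect() reports for all screens combined.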
screenSize(): {width: number, height: number}
Not recommended; use screenRect instead. Get the size of the main screen, i.e. its resolution.
- Return value: {width: number, height: number}, an object containing the width and height of the screen, both of type number.
colorAt(x: number, y: number): string
Get the color of the pixel at the given coordinates. The return value is an RGB color code as a hexadecimal string, such as "FFFFFF".
- x: number type, the horizontal coordinate.
- y: number type, the vertical coordinate.
- Return value: string type, an RGB color code in hexadecimal format.
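When a script needs to compare colors with some tolerance, the hexadecimal string is easier to work with once split into numeric channels. The sketch below uses a hypothetical helper, parseHexColor, which is not part of CukeTest, and assumes the string consists of six hexadecimal digits with an optional leading "#".

```javascript
// Hypothetical helper (not a CukeTest API): splits a six-digit hexadecimal
// RGB string into numeric red, green, and blue channels (0-255 each).
function parseHexColor(hex) {
    const match = /^#?([0-9a-fA-F]{6})$/.exec(hex);
    if (!match) {
        throw new Error(`not a 6-digit hex color: ${hex}`);
    }
    const value = parseInt(match[1], 16);
    return {
        r: (value >> 16) & 0xff,
        g: (value >> 8) & 0xff,
        b: value & 0xff,
    };
}

console.log(parseHexColor('FF8000')); // { r: 255, g: 128, b: 0 }
```

A string returned by Screen.colorAt could then be parsed and each channel compared against an expected value.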
capture(rect?: Rect): Buffer
Get a screenshot. If the rect parameter is specified, only the specified area is captured.
- rect: (optional) Rect type, see the Rect type introduction for its definition.
- Return value: Buffer type. Can be used directly in Report Attachments, or to construct Image objects with the Image.from() method.
captureToFile(filePath: string, rect?: Rect): void
Get a screenshot and save it to a file. filePath is the target path including the file name. If the rect parameter is specified, only the specified area is captured.
- filePath: string type, the path and file name to save to, e.g. "./support/image1.png".
- rect: (optional) Rect type, see the Rect type introduction for its definition.
- Return value: does not return any value.
takeScreenshot(filePath: string, monitor?: number): string
Not recommended. Takes a screenshot; this method was merged in from the original Util.takeScreenshot method. See the takeScreenshot() method introduction for details.