Motivation
The link between the core data types (such as Capstone, Insn, and InsnDetail) and the architecture-specific data types (those under crate::arch) is not explicit in the type signatures. For example, to get the instruction ID on the x86 arch, we have to write:
    let cs = Capstone::new().x86().build().unwrap();
    let insns = cs.disasm_all(CODE, 0x1000).unwrap();
    for i in insns.as_ref() {
        let insn_id = unsafe {
            // Neither typing nor documentation mention this!
            std::mem::transmute::<_, X86Insn>(i.id().0)
        };
        // do something with insn_id
    }
which is not easy for beginners to discover.
We could resolve this particular problem by adding a method arch_insn_id that returns the corresponding instruction ID as an enum variant, just like the InsnDetail::arch_detail method:
    pub enum ArchInsnId {
        X86InsnId(X86Insn),
        // other architectures
    }
    impl<'a> Insn<'a> {
        pub fn arch_insn_id(&self) -> ArchInsnId {
            // ... code
        }
    }
This leads to another, slightly annoying problem: we have to match against the return value of arch_insn_id to extract the x86-specific instruction ID, even though we already know the architecture. The same problem arises when we call InsnDetail::arch_detail or any other method with a similarly typed return value.
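To make the cost of the enum-based approach concrete, here is a minimal, self-contained sketch. All types below (X86Insn, ArmInsn, Insn, and the x86_id helper) are simplified, hypothetical stand-ins and not the real capstone-rs definitions; only the shape of the API matters.

```rust
// Hypothetical stand-ins for the real capstone-rs types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum X86Insn {
    Mov,
    Add,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArmInsn {
    Ldr,
    Str,
}

// The enum that an `arch_insn_id` method would return.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArchInsnId {
    X86InsnId(X86Insn),
    ArmInsnId(ArmInsn),
}

// Simplified instruction: it only stores its architecture-tagged ID.
pub struct Insn {
    id: ArchInsnId,
}

impl Insn {
    pub fn new(id: ArchInsnId) -> Self {
        Insn { id }
    }

    pub fn arch_insn_id(&self) -> ArchInsnId {
        self.id
    }
}

/// Even when the caller knows the disassembler targets x86, the type system
/// does not, so the non-x86 case still has to be handled somehow.
pub fn x86_id(insn: &Insn) -> Option<X86Insn> {
    match insn.arch_insn_id() {
        ArchInsnId::X86InsnId(id) => Some(id),
        _ => None,
    }
}
```

Every x86-only caller ends up with either this kind of helper or an unreachable match arm, which is the boilerplate the proposal tries to remove.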
The Proposal
The proposal posted here is only a (too) early draft; details needed for further consideration and discussion may be missing.
First of all, we can add a new trait that abstracts a specific architecture:
    pub trait Arch {
        type InsnId;
        type InsnDetail;
        // ... other stuff
    }
Then, we add a generic parameter to Capstone, Insn and InsnDetail that represents the architecture:
    pub struct Capstone<A: Arch> {
        // ... fields
    }
    pub struct Insn<'a, A: Arch> {
        // ... fields
    }
    pub struct InsnDetail<'a, A: Arch> {
        // ... fields
    }
Then, the methods mentioned in the motivation section can be typed in a more straightforward way:
    impl<A: Arch> Capstone<A> {
        pub fn insn_detail<'s, 'i: 's>(
            &'s self,
            insn: &'i Insn<'_, A>
        ) -> CsResult<InsnDetail<'i, A>> {
            // ... code
        }
    }
    impl<'a, A: Arch> Insn<'a, A> {
        pub fn id(&self) -> A::InsnId {
            // ... code
        }
    }
    impl<'a, A: Arch> InsnDetail<'a, A> {
        pub fn arch_detail(&self) -> A::InsnDetail {
            // ... code
        }
    }
No more matches, as long as we're targeting a specific architecture. Also, beginners can find the corresponding architecture-specific types just by looking at the signatures. The instruction ID problem is resolved in the same way.
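The following is a minimal, compilable sketch of how the generic parameter removes the match. It is not the real capstone-rs code: the X86 marker type, the insn_id_from_raw helper, the three-variant X86Insn enum, and the raw numeric values are all assumptions made up for illustration, and lifetimes are omitted for brevity.

```rust
use std::fmt::Debug;
use std::marker::PhantomData;

pub trait Arch {
    type InsnId: Copy + Debug + PartialEq;

    /// Hypothetical helper: maps the raw numeric ID reported by the engine
    /// to the architecture-specific enum.
    fn insn_id_from_raw(raw: u32) -> Self::InsnId;
}

// Stand-in for the real x86 instruction ID enum; the variants and their
// numeric values are invented for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum X86Insn {
    Invalid,
    Mov,
    Add,
}

/// Zero-sized marker type selecting the x86 architecture at compile time.
pub struct X86;

impl Arch for X86 {
    type InsnId = X86Insn;

    fn insn_id_from_raw(raw: u32) -> X86Insn {
        match raw {
            1 => X86Insn::Mov,
            2 => X86Insn::Add,
            _ => X86Insn::Invalid,
        }
    }
}

/// Simplified instruction, tagged with its architecture in the type.
pub struct Insn<A: Arch> {
    raw_id: u32,
    _arch: PhantomData<A>,
}

impl<A: Arch> Insn<A> {
    pub fn new(raw_id: u32) -> Self {
        Insn { raw_id, _arch: PhantomData }
    }

    /// The return type is already architecture-specific: no enum wrapper,
    /// no match, and no transmute at the call site.
    pub fn id(&self) -> A::InsnId {
        A::insn_id_from_raw(self.raw_id)
    }
}
```

With these definitions, `Insn::<X86>::new(1).id()` has the type X86Insn directly, and the conversion from the raw ID lives in one place inside the Arch implementation instead of in user code.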
Unresolved Problems
When the target architecture cannot be determined at compile time (for example, when the disassembler is created by the Capstone::new_raw method), the generic parameter cannot be set to a specific architecture. To resolve this, we may need to introduce a DynamicArch type that implements Arch and indicates that the target architecture is determined at runtime.
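One possible shape for DynamicArch, sketched under heavy assumptions: the trait method takes &self so that a runtime value can carry the architecture, and the associated InsnId type falls back to an enum over all architectures. None of the names below except Arch, InsnId, and DynamicArch come from the proposal; the instruction enums and raw ID values are invented for illustration.

```rust
pub trait Arch {
    type InsnId;

    /// Hypothetical helper; `&self` lets a runtime-selected architecture
    /// decide how to interpret the raw ID.
    fn insn_id_from_raw(&self, raw: u32) -> Self::InsnId;
}

// Invented stand-ins for the per-architecture instruction ID enums.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum X86Insn {
    Invalid,
    Mov,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArmInsn {
    Invalid,
    Ldr,
}

fn x86_from_raw(raw: u32) -> X86Insn {
    if raw == 1 { X86Insn::Mov } else { X86Insn::Invalid }
}

fn arm_from_raw(raw: u32) -> ArmInsn {
    if raw == 1 { ArmInsn::Ldr } else { ArmInsn::Invalid }
}

/// Static case: the architecture is fixed in the type.
pub struct X86;

impl Arch for X86 {
    type InsnId = X86Insn;

    fn insn_id_from_raw(&self, raw: u32) -> X86Insn {
        x86_from_raw(raw)
    }
}

/// Dynamic case: the architecture is a runtime value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DynamicArch {
    X86,
    Arm,
}

/// In the dynamic case the ID has to remain an enum over architectures.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArchInsnId {
    X86(X86Insn),
    Arm(ArmInsn),
}

impl Arch for DynamicArch {
    type InsnId = ArchInsnId;

    fn insn_id_from_raw(&self, raw: u32) -> ArchInsnId {
        match self {
            DynamicArch::X86 => ArchInsnId::X86(x86_from_raw(raw)),
            DynamicArch::Arm => ArchInsnId::Arm(arm_from_raw(raw)),
        }
    }
}
```

Under this design, only users who pick the architecture at runtime pay for the match; users of a statically known architecture keep the precise return type.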
This proposal may be premature, but I do think it reveals some (possibly minor) problems.