IABSD.fr/src/sys/dev/pv

Log

Author Commit Date CI Message
3d7a3484 2025-06-20 14:06:34 pvclock: Fix reading clock, add time sensor * Fix integer overflows during multiplication. This caused time to run at the wrong speed on some machines, depending on tsc frequency. * Increase accuracy by disabling interrupts while reading the clock * Fix the not-TSC_STABLE handling which was broken because it wrongly compared 32 and 64 bit values. (This requires some atomic hackery on i386). * Add a timedelta sensor using the KVM WALL_CLOCK MSR, similar to the sensor in vmmci(4) Partially inspired by an earlier diff by cheloa@ ok kettenis@ mlarkin@
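The overflow described in 3d7a3484 can arise when a 64-bit TSC delta is multiplied by a 32-bit scale factor, because the product may need up to 96 bits. The following is a minimal sketch of one way to avoid it, assuming the usual pvclock scaling (shift the delta, multiply, keep the bits above the low 32). The helper name `ex_scale_delta` is hypothetical, and this is not the code from the commit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute (delta * mul_frac) >> 32 without
 * overflowing 64 bits, by splitting delta into two 32-bit halves.
 */
static uint64_t
ex_scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
	uint64_t lo, hi;

	assert(shift > -64 && shift < 64);
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;

	/* Each partial product is 32 x 32 bits and fits in 64 bits. */
	lo = (delta & 0xffffffffULL) * mul_frac;
	hi = (delta >> 32) * mul_frac;

	/* hi already carries the factor 2^32 that the shift removes. */
	return hi + (lo >> 32);
}
```

A naive `(delta * mul_frac) >> 32` silently drops the upper bits once delta exceeds 32 bits, which is consistent with the wrong clock speed the commit message describes.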
e4811c2f 2025-06-06 07:43:13 Disable TCP checksum offload, because it's broken on newer Hyper-V versions. ok sthen, dlg
3161ed12 2025-05-17 08:36:01 vio: Add missing intrmap Fixes compile errors with if_vio when other drivers are removed. From Crystal Kolipe. ok mvs@
6cbbb384 2025-03-31 15:40:22 pvclock: Use VM_PAGE_TO_PHYS as requested by kettenis@
416e3503 2025-03-31 14:43:00 pvclock: Map page unencrypted If SEV is enabled, we need to map the pvclock page as unencrypted / shared with the hypervisor. ok bluhm@ looks good to me hshoexer@
49f05fab 2025-02-24 09:40:01 Refactor LRO turn off code It's easier to turn off LRO via ioctl calls inside of several hardware and pseudo interfaces. Thus, we avoid manipulating internal data structures from the outside and avoid unnecessary reinitializations. Tested by bluhm@ OK bluhm@
bca80959 2025-01-30 07:48:50 vio: Don't set MAC without MAC feature If the VIRTIO_NET_F_MAC feature is not negotiated, don't try to write the mac address registers. Some hypervisors don't like that.
1f33ff36 2025-01-29 14:03:18 virtio: Prefer 1.x over 0.9 Make virtio 1.x the default if the hypervisor offers both 0.9 and 1.x. Version 1.x is required for feature bits >= 32, as in vio(4) multiqueue. Also be more compliant to the virtio 1.x standard, which says "Drivers MUST match any PCI Revision ID value". tested by bluhm@
c6d6c1ee 2025-01-28 19:53:06 vio: set the flowid to the rx queue number We don't currently have anything better and without this, all packets are sent on the same queue for udp forwarding. ok dlg@
29c809d3 2025-01-17 08:58:38 vio(4): Fix TSO without MRG_RXBUF If MRG_RXBUF is not negotiated, Tx mappings should be large enough for max. TSO packets of MAXMCLBYTES. With tweaks from sf@ ok sf@
35393bd4 2025-01-16 10:33:27 constify struct virtio_feature_name Most are already static const. Adjust the rest, too.
e484fee2 2025-01-14 14:32:32 vio: avoid some spurious error messages with qemu In some user mode networking configurations, qemu will offer the VIRTIO_NET_F_CTRL_GUEST_OFFLOADS feature but then reply with an error even if we configure no offload feature. Don't print an error in this case.
8806e5c3 2025-01-14 14:28:38 whitespace fixes
dd246627 2025-01-14 12:30:57 vio: Enable multiqueue We use a single interrupt vector for every rx/tx queue pair. With config and control queue vectors, we need N+2 vectors for N queues. If multi-queue is not available, the old scheme is used with either one vector per virtqueue or one vector for all queues. * virtio: Add API to establish interrupts on specific cpus in child drivers. Also make virtio_pci_setup_msix return proper errno. * virtio_pci: Increase max number of MSIX vectors * vio: Configure multiple queues and allocate proper interrupts. OK bluhm@
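The vector arithmetic in dd246627 (one vector per rx/tx queue pair, plus one each for config and control) can be sketched as follows. Both helper names are hypothetical, and the second function, which derives a pair count from the number of available vectors, is an assumption about how such a limit might be computed rather than something the commit states.

```c
#include <assert.h>

/* Hypothetical helper: MSI-X vectors needed for nqueues rx/tx pairs. */
static int
ex_vio_vectors_needed(int nqueues)
{
	assert(nqueues >= 1);
	/* One vector per rx/tx pair, plus config and control queue. */
	return nqueues + 2;
}

/* Hypothetical helper: number of rx/tx pairs that fit in avail vectors. */
static int
ex_vio_max_queue_pairs(int avail)
{
	return avail > 2 ? avail - 2 : 0;
}
```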
4d9ed068 2025-01-09 10:55:22 virtio: Support unused virtqueues We will need this for vio(4) multiqueue, if the hypervisor offers more packet queues than we want to use. The control queue comes after the packet queues. ok bluhm@
01da065f 2025-01-06 14:23:52 wrap long lines no code change
77d0f823 2024-12-20 22:18:27 virtio: Refactor attach logic virtio 1.x requires that all queue setup, including the queue interrupt vector, is done before setting the queue_enable register to 1. This conflicts with how we do things right now: * We implicitly make queue setup in virtio_alloc_vq(), which is called from the child driver attach functions. This also sets queue_enable=1. * Later, we allocate the interrupts and set the queue interrupt vectors in the second half of the virtio transport attach functions. This is a violation of a MUST from the standard and causes problems with some hypervisors, in particular those that have no virtio 0.9 support, which has no such ordering requirements. To fix this: * Move the interrupt allocation to a new virtio_attach_finish() function. This does all queue setup, including the interrupt vectors. * Don't call virtio_setup_queue() in virtio_alloc_vq() anymore. * We can also move the setting of the DRIVER_OK flag into this function. virtio_attach_finish() must be called before using any virtqueue or writing any virtio config register. While there, * also streamline the attach error handling in all drivers. * skip initially setting sc_config_change to NULL, the softc is initialized to 0. ok jan@ tested by bluhm@
ff0ccef3 2024-12-03 19:14:40 vio: Unlock, switch to qstart function Run without kernel lock. Use the network stack functions used for multiqueue, but still only run on one queue. Add a virtio interface for an interrupt barrier. This is the reverted diff plus a missing chunk. Tested by dtucker, bluhm, sf
5e0634cb 2024-11-27 13:26:42 Revert "vio: Unlock" This causes some crashes. Revert for now ok sf@
b5f9b883 2024-11-27 02:38:35 continue enumerating devices if a device is not matched fixes xbf(4) and xnf(4) not attaching on XCP-ng 8.3/Xen 4.17 which has "device/9pfs/" from Joel Knight
3aca4c8c 2024-11-27 02:14:48 zero attach args; return on missing properties will be removed
399b2cc5 2024-11-25 19:30:47 vio: Unlock, switch to qstart function Run without kernel lock. Use the network stack functions used for multiqueue, but still only run on one queue. Add a virtio interface for an interrupt barrier. ok dlg@
095824d5 2024-11-04 15:43:10 viogpu: tune down debug messages viogpu_send_cmd() is called for every framebuffer update. Only print log message at VIRTIO_DEBUG >= 3 to avoid log spam.
122bebd4 2024-11-04 15:41:09 vio_dump: Fix control queue dump If the device has just been reset (e.g. after ifconfig down), we cannot query the feature bits. Make sure we dump the control virtqueue even in this case.
3c9ccb50 2024-10-03 08:59:49 vio: Increase rx mbuf size with lro bluhm found that using bigger rx mbufs helps tcp splice performance if lro is enabled. Use 4k in that case. Also fix confusion in rx dmamap segment count. Even with lro/tso, we only put unfragmented mbufs into the rx queue. Therefore we only need max. 2 segments, one for the mbuf and one for the separate header for legacy virtio devices. OK bluhm@
b7c7c6b0 2024-09-19 06:23:38 vio: allow longer tx chains When TCP segmentation offload is supported, we may get larger packets with more dma segments. Allocate more segments in the busdma_map in this case, so that we need to defragment less often. ok jan@
101c378c 2024-09-17 09:00:14 vio: Reduce code duplication in control queue handling Pull the common parts of all the control queue operations into separate functions. While there, avoid setting sc_ctrl_inuse FREE if it was RESET, except in vio_stop. Doing so could lead to more race conditions. ok bluhm@
9bb3f5fb 2024-09-14 09:21:13 pvclock.h not needed
80192087 2024-09-04 09:12:55 vio: put enqueue and dmasync into a function Before we enqueue with VIO_DMAMEM_ENQUEUE(), we always sync with VIO_DMAMEM_SYNC(). In order to reduce verbosity, create a function that does both. ok bluhm@
9593dc34 2024-09-04 07:54:51 Fix some spelling. Input and ok jmc@, jsg@
e203d0a4 2024-09-04 06:36:33 vio: style fixes ok bluhm@
319dd760 2024-09-04 06:34:08 vio: Re-arrange structs for multi-queue Move per-queue data structures into a new struct vio_queue and adjust mem allocation. Only one queue is allocated for now. ok bluhm@
0e6436ed 2024-09-02 08:26:26 virtio: Move interrupt setup into separate function Put the MSIX vector into struct virtqueue and create a transport specific function that feeds the vectors to the device. This will allow child devices to influence which vectors are used for which virtqueues. This will be used by multi-queue vio(4) to route corresponding rx/tx queue interrupts to the same cpu. The setup_intrs() function also sets the config interrupt MSIX vector which fixes a bug that virtio_pci_set_msix_config_vector() would not be called after a device reset. OK bluhm@
cd07c705 2024-08-28 12:40:22 vio: Fix allocation sizes For both rx and tx, we need an array of bus_dmamap_t and mbuf pointers each. This results in a size of (rxqsize + txqsize) * (sizeof(bus_dmamap_t) + sizeof(struct mbuf *)) The factor 2 before the sizeof(bus_dmamap_t) was too much and we allocated more than we needed. OK bluhm@
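The size formula in cd07c705 can be written out as a short sketch. The type definitions below are stand-ins for the kernel types, the function names are hypothetical, and the "old" variant is only a guess at where the extra factor 2 sat, based on the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel types; assumptions for illustration only. */
struct mbuf;
typedef struct bus_dmamap *bus_dmamap_t;

/* Size of the dmamap and mbuf pointer arrays for both rings. */
static size_t
ex_vio_arrays_size(size_t rxqsize, size_t txqsize)
{
	assert(rxqsize > 0 && txqsize > 0);
	return (rxqsize + txqsize) *
	    (sizeof(bus_dmamap_t) + sizeof(struct mbuf *));
}

/* Assumed shape of the earlier, oversized calculation. */
static size_t
ex_vio_arrays_size_old(size_t rxqsize, size_t txqsize)
{
	assert(rxqsize > 0 && txqsize > 0);
	return (rxqsize + txqsize) *
	    (2 * sizeof(bus_dmamap_t) + sizeof(struct mbuf *));
}
```

Under these assumptions the difference between the two is `(rxqsize + txqsize) * sizeof(bus_dmamap_t)` bytes of unused memory.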
d6ed892b 2024-08-27 19:11:20 vio: whitespace and message tweaks Fix whitespace. Other network drivers use a comma in the boot message. Make more clear what features are meant in an error message OK bluhm@
8cd32577 2024-08-27 19:01:11 constify struct virtio_ops OK bluhm@
fdf28b39 2024-08-27 18:44:12 virtio: Remove some unused leftovers Some fields in struct virtqueue are unused. The maxsegsize argument to virtio_alloc_vq is unused. OK bluhm@
dc275227 2024-08-26 19:37:54 virtio: Introduce dedicated attach args Instead of abusing virtio_softc as attach args, create a separate struct. Use it to pass the number of available interrupts. This will be useful for vio(4) multi-queue support. ok jan@
310c43a6 2024-08-26 19:24:02 vio(4): Fix hardmtu without MRG_RXBUF Without VIRTIO_NET_F_MRG_RXBUF, we cannot chain several buffers together and we can only receive packets up to the length of the buffers we put into the ring. OK bluhm@
61df4475 2024-08-19 00:03:12 pvbus_activate does nothing except call config_activate_children (4 possible cases). it does not need to exist. encoding NULL into the cfattach structure does the same thing.
9ec2faa3 2024-08-16 13:02:44 vio(4): Don't set IPv4 checksum OK flag for rx packets. The virtio specification just addresses TCP/UDP checksum offloading. Thus, we have to check the IPv4 checksum in our stack. ok sf@
657921cb 2024-08-13 08:47:28 Sync full virtqueue on device reset We initialize the whole virtqueue and must make sure that the device sees this even for the areas that are normally only written by the device. Otherwise there may be an assertion fail during ifconfig up, as found by bluhm@ with hshoexer@'s bounce buffer diff. OK bluhm@
d3638ce2 2024-08-01 11:13:19 virtio: Fix dmamap_sync calls Add some missing bus_dmamap_sync calls, noticed with SEV and based on an earlier diff by hshoexer@. Some of the required syncing is done in virtio_check_vq(). Make sure to use that function instead of calling the virtqueue done function directly from device specific drivers. For viogpu this means that we cannot poll with virtio_dequeue() but must use virtio_check_vq() instead. To make this more clear, rename viogpu_vq_wait() into viogpu_vq_done(). While there, set the DRIVER_OK flag even earlier. It must be set before using any virtqueue. ok kettenis@
6354d80b 2024-07-26 07:55:23 virtio: add/fix feature bits There was an off-by-one in unused vioblk feature defines. Fix this. Add missing feature bits from virtio 1.2 so that they are displayed with VIRTIO_DEBUG. ok jan@
2175e0d6 2024-07-26 06:29:01 vio: Don't request csum offload if not negotiated The standard says "A driver MUST NOT enable an offload for which the appropriate feature has not been negotiated." ok jan@
50bdd322 2024-07-25 08:35:40 virtio: Allow more verbose debugging If VIRTIO_DEBUG is set to 2, dump the whole virtqueues.
c36fc2a0 2024-07-23 19:14:05 virtio: fix comment
de518290 2024-06-28 14:46:31 Cleanup control queue checks in vio(4). Add missing newlines in prints while here. ok sf@
da5607f6 2024-06-26 01:40:49 return type on a dedicated line when declaring functions ok mglocker@
1b7d34a5 2024-06-10 19:26:17 Use TCP Large Receive Offload in vio(4). Also introduce the guest offload feature to turn LRO off/on. Tested by Mark Patruck, sf@ and bluhm@ ok sf@ and bluhm@
bd2924d5 2024-06-10 18:21:59 Clarify panic strings in vio(4) suggested by bluhm ok bluhm
1a9be797 2024-06-09 16:25:27 Introduce IFCAP_VLAN_HWOFFLOAD for vio(4). Add IFCAP_VLAN_HWOFFLOAD to signal hardware like vio(4) can handle checksum or TSO offloading with inline VLAN tags. tested by Mark Patruck, sf@ and bluhm@ ok sf@ and bluhm@
33bec486 2024-06-04 09:51:52 vio(4): remove useless casts and fix spacing ok sf@
73d9c7e8 2024-05-28 12:11:26 vio(4): fix jumbo frames vio_rx_offload() was called too early. So, the consistency checks of ether_extract_headers() cause wrong packet detection and wrong checksums. also tested by bluhm ok bluhm@
b4155af8 2024-05-24 10:05:55 remove unneeded includes
68e76eaf 2024-05-17 16:37:10 vio: Fix signal handling and locking in sysctl path Commits f0b002d01d5 "Release the netlock when sleeping for control messages in in vioioctl()" and 126b881f71 "Insert a workaround for per-ifp ioctl being called w/o NET_LOCK()." in vio(4) fixed a deadlock but may cause a crash with a protection fault trap if addresses are added/removed concurrently. The actual issue is that signals are not handled correctly while sleeping. After a signal, there is a race condition where sc_ctrl_inuse is first set to FREE and then the interrupt handler sets it to DONE, causing a hang in the next vio_wait_ctrl() call. To fix it: * Revert the NET_LOCK unlocking work-around. * Remove PCATCH from the sleep call when we wait for control queue, avoiding the race with vio_ctrleof(). To ensure that we don't hang forever, use a 5 second timeout. * If the timeout is hit, or if the hypervisor has set the DEVICE_NEEDS_RESET status bit, do not try to use the control queue until the next ifconfig down/up which resets the device. * In order to allow reading the device status from device drivers, add a new interface to the virtio transport drivers. * Avoid a crash if there is outgoing traffic while doing ifconfig down. OK bluhm@
3b372c34 2024-05-14 08:26:13 remove prototypes with no matching function
0f9e9ec2 2024-05-13 01:15:50 remove prototypes with no matching function ok mpi@
7f8ccd64 2024-05-07 18:35:23 Additional check for TSO packets with 0 MSS. Tested by bluhm ok bluhm@
3315a92c 2024-04-10 19:55:50 Implement TCP Segmentation Offload for vio(4) Tested by Brian Conway and bluhm With tweaks from bluhm ok bluhm
ac5f541a 2024-02-14 22:41:48 Check IP length in ether_extract_headers(). For LRO with ix(4) it is necessary to detect ethernet padding. Extract ip_len and ip6_plen from the mbuf and provide it to the drivers. Add extended sanity checks, like IP packet is shorter than TCP header. This prevents offloading to network hardware with bogus packets. Also iphlen of extracted headers contains header length for IPv4 and IPv6, to make code in drivers simpler. OK mglocker@
e78a66e5 2024-02-13 13:58:19 Analyse header layout in ether_extract_headers(). Several drivers need IPv4 header length and TCP offset for checksum offload, TSO and LRO. Accessing these fields directly caused crashes on sparc64 due to misaligned access. It cannot be guaranteed that IP and TCP header is 4 byte aligned in driver level. Also gcc 4.2.1 assumes that bit fields can be accessed with 32 bit load instructions. Use memcpy() in ether_extract_headers() to get the bits from IPv4 and TCP header and store the header length in struct ether_extracted. From there network drivers can easily use it without caring about alignment and bit shift. Do some sanity checks with the length values to prevent that invalid values from evil packets get stored into hardware registers. If check fails, clear the pointer to the header to hide it from the driver. Add debug prints that help to figure out the reason for bad packets and provide information when debugging drivers. OK mglocker@
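The approach described in e78a66e5 (copy the bits out with memcpy() instead of dereferencing a struct bit field, then sanity-check the length) can be illustrated with a small userland sketch. The function `ex_ipv4_hlen` is hypothetical and much simpler than ether_extract_headers(). It only shows the access pattern, and returning 0 stands in for hiding a bad header from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return the IPv4 header length in bytes,
 * or 0 if the packet fails the sanity checks.
 */
static unsigned int
ex_ipv4_hlen(const void *pkt, size_t len)
{
	uint8_t vhl;
	unsigned int hlen;

	assert(pkt != NULL);
	if (len < 20)
		return 0;

	/* Copy the byte out; no assumption about the buffer's alignment. */
	memcpy(&vhl, pkt, sizeof(vhl));
	hlen = (vhl & 0x0f) << 2;

	/* Reject wrong version, too-short header, or header past the data. */
	if ((vhl >> 4) != 4 || hlen < 20 || hlen > len)
		return 0;
	return hlen;
}
```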
8dedfb50 2024-01-04 01:32:06 Revert previous. splx(9) can call kvp_get_ip_info() from any place with netlock held and cause recursive lock acquisition issue.
fbb4acec 2023-12-20 09:51:06 vio(4): checksum offloading for TCP/UDP in IPv6 Packets ok sf@
7d925a4a 2023-12-11 09:40:42 vio(4): simplify mbuf parsing with ether_extract_headers() ok sf@
9b4cc2ec 2023-12-09 10:36:05 vio(4) add recv TCP/UDP checksum offloading tested on Linux/KVM tested on proxmox and vultr by florian ok florian
c0ea3bd1 2023-12-02 10:01:35 virtio: Fix handling of feature bits >= 32 Fix handling of feature bits >= 32. This does not yet affect any driver as no high feature bit besides VERSION_1 is used, and that one has special handling. Also, with VIRTIO_DEBUG, simply walk through all transport and device feature names, so that we don't need to adjust the if clause whenever the standard introduces new transport features. ok jan@ bluhm@
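Feature bits at position 32 and above, as discussed in c0ea3bd1, only survive if the mask is kept in a 64-bit type and built with a 64-bit constant. A minimal sketch follows. The macro and function names are hypothetical, and bit 32 is used because the virtio standard assigns it to VERSION_1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name; the 1ULL keeps the shift in 64-bit arithmetic. */
#define EX_F_VERSION_1	(1ULL << 32)

/* Hypothetical helper: test one bit of a 64-bit feature mask. */
static int
ex_has_feature_bit(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (int)((features >> bit) & 1);
}
```

Truncating such a mask to 32 bits, for example by storing it in a `uint32_t`, drops every high feature bit without any warning at run time.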
cf96265b 2023-11-10 15:51:19 Make ifq and ifiq interface MP safe. Rename ifq_set_maxlen() to ifq_init_maxlen(). This function neither uses WRITE_ONCE() nor a mutex and is called before the ifq mutex is initialized. The new name expresses that it should be used only during interface attach when there is no concurrency. Protect ifq_len(), ifq_empty(), ifiq_len(), and ifiq_empty() with READ_ONCE(). They can be used without lock as they only read a single integer. OK dlg@
4ac41129 2023-11-08 12:01:21 Allow Xen to use backing store devices with 4K-byte sectors. Problem reported and much testing by Christian Kujau. Thanks! Roughly equivalent to bouyer@NetBSD changes prompted by Christian. ok mlarkin@ dlg@
b510f7b4 2023-09-26 08:30:13 Use shared netlock to protect ifnet data within vmt_tclo_broadcastip(). Execute vmt_tclo_tick() timeout handler in process context to allow context switch within vmt_tclo_broadcastip(). ok yasuoka
60c1ec16 2023-09-23 13:01:12 Use shared netlock to protect if_list and ifa_list walkthrough and ifnet data access within kvp_get_ip_info(). ok bluhm
5932474c 2023-07-28 16:54:48 Initialize handlers with NULL, not 0.
6c89734d 2023-07-07 10:23:39 The per-VQ MSI-X interrupt handler needs to sync DMA mappings in the same way that the shared interrupt handler does. This is one of the requirements of virtio_dequeue(), as specified in its comment above. Without the DMA sync, it will not see a new entry on the ring and return. Since the interrupt is edge-triggered there won't be another one and we'll get stuck. ok dv@
bc3c2f61 2023-07-05 18:23:10 The hypercall page populated with instructions by the hypervisor is not IBT compatible due to lack of endbr64. Replace the indirect call with a new hv_hypercall_trampoline() routine which jumps to the hypercall page without any indirection. Allows me to boot OpenBSD using Hyper-V on Windows 11 again. ok guenther@
253b36e6 2023-07-03 07:40:52 typofix lladdr in function names; OK deraadt jan
cdd24841 2023-05-29 08:13:35 virtio: Set DRIVER_OK earlier The DRIVER_OK bit must be set before using any virt-queues. To allow virtio device drivers to use the virt-queues in their attach functions, set the bit there and not in the virtio transport attach function. Only vioscsi and viogpu really need this, but let's only have one standard way to do this. Noticed because of hangs with vioscsi on qemu/windows and in the Oracle cloud. With much debugging help by Aaron Mason. Also revert vioscsi.c 1.31 "Temporarily workaround double calls into vioscsi_req_done()" ok krw@
801ddbd0 2023-04-27 13:52:58 Temporarily workaround double calls into vioscsi_req_done() causing NULL de-reference. Reported, initial patch and tests by Antun Matanovic. Thanks! ok miod@
022e208a 2023-04-23 10:29:35 Stop setting ri->ri_bs to prevent a panic caused by rasops accessing its uninitialized content. When we rasops_init() with RI_VCONS, a new screen is allocated. If ri->ri_bs is set, this will be copied. Otherwise a new one will be allocated and filled with ASCII spaces. Copying the ri->ri_bs is useful in case we have an early console which contents we want to keep. As we do not have an early console here, there's no point in setting it at the moment. With this my Hetzner arm64 VM doesn't panic anymore. ok jcs@ kettenis@
e208b562 2023-04-20 19:28:30 add viogpu, a VirtIO GPU driver works enough to get a console on qemu with more work to come from others feedback from miod ok patrick
b3af768d 2023-04-11 00:45:06 fix double words in comments feedback and ok jmc@ miod, ok millert@
24ee467d 2023-02-04 19:19:35 timecounting: remove incomplete PPS support The timecounting code has had stubs for pulse-per-second (PPS) polling since it was imported in 2004. At this point it seems unlikely that anyone is going to finish adding PPS support, so let's remove the stubs: - Delete the dead tc_poll_pps() call from tc_windup(). - Remove all tc_poll_pps symbols from the kernel. Link: https://marc.info/?l=openbsd-tech&m=167519035723210&w=2 ok miod@
98dbb30e 2023-01-07 06:40:21 The maximum length of the value is extended to 64k bytes. ok yasuoka
529a7364 2022-12-28 10:11:36 Change space character to TAB. ok tb mlarkin giovanni
0b965e09 2022-12-26 04:09:14 Add close vmt.
a962230c 2022-12-08 05:45:36 Fix pvbus to specify M_ZERO properly. ok kn mvs mlarkin asou deraadt
45d281fc 2022-12-03 10:57:04 Modify vmt to use the buffer allocated in pvbus directly instead of the buffer in the vmt softc when doing RPC for PVBUSIOC_KV{READ|WRITE} ioctl. ok asou
1ce42776 2022-11-10 02:47:52 Return error number instead of call panic(). ok mpi@
ccb45f8e 2022-09-08 10:22:05 Rename global ifnet TAILQ Naming the list like the struct itself makes for awful grepping. Call the global variable "ifnetlist" from now on. There used to be kvm(3) consumers in base picking up this symbol, but those have long been converted to other interfaces. A few potential ports users remain, same deal as sys/net/if_var.h r1.116 "Remove struct ifnet's unused if_switchport member": they get bumped. Previous users pointed out by deraadt OK bluhm
07c7d596 2022-08-29 02:08:13 use ansi volatile keyword, not __volatile__ ok miod@ guenther@
da893f17 2022-08-25 17:38:16 amd64, i386: use delay_init() instead of writing delay_func by hand Now that we have delay_init(), use it in all the places where we currently set delay_func by hand. lapic_delay() is great: 3000. hv_delay() is needed before we set up lapic_delay() on Hyper-V guests: 4000. tsc_delay() is better than lapic_delay() and (probably?) hv_delay(): 5000. We may bump hv_delay's quality value up over that of tsc_delay() in a future patch. It's a little ambiguous whether hv_delay() causes a VM exit. Idea and patch from jsg@. With tons of input, research and advice from jsg@. Link: https://marc.info/?l=openbsd-tech&m=166053729104923&w=2 ok mlarkin@ jsg@
a454aff3 2022-04-16 19:19:58 constify SCSI adapter entry points ok krw@
bbf448d2 2022-03-23 13:03:36 KASSERT() that an id read from a descriptor is valid before using it as an index into an array. Reported by Demi Marie Obenour of Invisible Things Lab. feedback and ok jmatthew@
527beed4 2022-03-07 18:52:16 vio(4): use NULL instead of 0 with sc_{rx,tx}_mbuf pointer array. ok millert@, deraadt@
4b1a56af 2022-01-09 05:42:36 spelling feedback and ok tb@ jmc@ ok ratchov@
ffaee248 2021-11-05 11:38:29 Constify struct cfattach.
04c68813 2021-10-24 09:16:53 pretty & normalize the cfdriver decl
69b036cd 2021-08-31 15:52:10 When running on Hyper-V, make use of its timecounter as delay func in case we're still using the i8254 for that. On Hyper-V Gen 2 VMs there is no i8254 we can trust, so we need some kind of fallback, especially if there is no TSC either. Discussed with the hackroom ok kettenis@
4a58382d 2021-07-26 11:06:36 fix an mbuf leak with m_len 0 mbufs from niklas@ via mikeb@
1ab70bb7 2021-07-26 06:00:37 Add mtx_enter/mtx_leave in kvp_pool_keys(). ok mikeb
e977d71d 2021-06-11 12:47:15 Drop received packets unless IFF_RUNNING is set. When hvn(4) attaches it sends commands and waits for replies to come back in. The receive pipe seems to contain both command completions and data packets. When waiting for command completion during hvn(4) attach, it's possible for packets to show up as well. We shouldn't call if_input() if hvn(4) is not set up, so drop them when we're not running. ok mikeb@